Matt Walters - How to CQRS in Node: Eventually Consistent, Unidirectional Systems with Microservices

Captions
All right, so I'm going to start off with a couple of questions. Did anybody attend last month's meetup and see my lightning talk? We've got one. All right, our retention is one, which is okay. So you happen to know the concepts I went over there; it was sort of a prelude to some of the things we're going to go over here. The title of my talk is How to CQRS in Node: eventually consistent architectures that scale and grow, and so on. One more question: is anyone familiar with CQRS? Okay, so there are a few hands.

CQRS is a type of distributed architecture. It's a design pattern, but typically we talk about design patterns for things that happen within a single process, whether it's the factory pattern for creating instances of things or an adapter pattern for wrapping an interface, stuff like that. CQRS is often described in ways that are really ambiguous, because you have to build up a few blocks to make a distributed system. To explain it you really have to give a high-level overview so people get a little bit of an understanding of what it is, break it down into the different blocks and how they're related, and then bring it back up and say: okay, that thing we just talked about for the last 45 minutes, that's what CQRS is. It's been a tough thing to explain in the past. My goal for today is to have you walk out of here saying, "I know what CQRS is, and that's cool." I think I can do that. Sound good? All right, let's do it.

So: how to CQRS in Node, eventually consistent architectures that scale and grow. I could just as easily have left out CQRS and said "how to create eventually consistent, distributed, unidirectional systems," but I didn't; four little letters is easier to say.

Who am I? I'm a guy who's been coding for a long time. At one time I was a .NET consultant; I was a principal consultant at a company called Magenic. I got a background in domain-driven design, and that kind of led me to these types of patterns. I went on to start a company; it was a Techstars company, we were in Seattle, and we moved it to New York. While I was building that company is when I decided, hey, there's this thing called Node.js, maybe I'll try that out and see if we can make things faster than we could in .NET, and anyway, I didn't want to hire .NET developers. So I tried it out, and it turned out it was really easy to make these distributed systems. Today we would use the word microservices to describe the components of that system; back then they were just services, but we had a goal of always making our services small: do one thing and do one thing well. That's what microservices should be.

That company was an online marketing company called GoChime. I eventually left the company and did some consulting, but my co-founders did an amazing job and actually sold the company a month and a half ago. The software that we built there we open-sourced, and I went on to apply it at a number of different companies. It turns out it works in all sorts of different problem domains: it worked in an online marketing system churning through the Twitter firehose in real time, and it even powers an electronic bond exchange, basically a stock market for bonds; 1.75 billion dollars has traded over this thing. These tools allowed us to architect our system in a way that you could point different people at different services and say, you own this thing.
Then, as your system grows, it's very easy to figure out where to plug in new functionality. It's all open source, and other companies reach out to let me know they're using these tools. We're actually creating a new site and a tool called microservice, kind of a land grab of a name, and it's microsvc; in the coming months you're going to see something that takes all of these ideas we're showing now and makes them even easier to use. So let's dive into it. That's me; actually, that's me before I decided not to cut my hair for seven months, so that actually is me.

So when do you CQRS? CQRS allows you to easily build real-time, reactive systems, and this pattern has been around since before "reactive" or "microservices" were even keywords, so it's a set of ideas we can apply to any new tech as it comes out. You use it when you prefer small, modular services that do one thing well. You use it when you're building a system that you expect to grow over time: you're starting a new business, you're just discovering things about the problem domain, you're going to learn things, you're going to start up a new service that does something with orders or shipping or whatever, but it does that one thing well; you learn more, you add new functionality to your business, you pop in a new service. It's super helpful in that it makes it really easy to organically grow your codebase over time, so that when your code is too big for one team you can split your teams easily, and it's very easy to subdivide the system out to the different individuals in your organization. It also stresses that every single component in your system can be scaled differently, and we'll see what that means in a bit. For me personally, the set of cases where you would not use CQRS is pretty small: you're not going to use it for a static site, and you're not going to use it for some CRUD application where you're just doing create, read, update, delete. But for anything larger than that I would recommend you use CQRS, or at least do sort of a halfway step, and if we have time at the end of the presentation I'll show you how I did a halfway step to make the nycnode.com site.

So here we go: what is CQRS? Pump the brakes a little bit; let's actually not say what CQRS is, that's the next slide. We're going to do a primer. There's this dude named Bertrand Meyer, and in 1988 he was developing the Eiffel programming language, which I'm sure none of you use because no one uses it; it's totally academic, and if you have ever used it, it was probably in a language-concepts course in school, but it's pretty much the reference implementation for object-oriented programming languages. He came up with a rule called command-query separation, which says that when you're designing objects, or the types that define objects, any particular method you're implementing should either be a command that performs an action and updates state, or it should tell you what the state is, and never both. You can understand why: if telling something to do something, or asking for the state of something, changed its state, you could never really know what the state is. It's like the cat in the box: once you touch it, it's changed. So that leads us to make class diagrams like this: your methods on any type of object should either do something or get something, and they should never do both.
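As a quick illustration of that rule (my own example, not from the talk): each method on the object below is either a command or a query, never both.

    // Command-query separation at the object level:
    // every method either changes state or reports state, never both.
    class Order {
      constructor() {
        this.items = [];
      }

      // command: performs an action and updates state
      addItem(item) {
        this.items.push(item);
      }

      // query: reports state without side effects
      itemCount() {
        return this.items.length;
      }
    }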
Okay, so CQRS is sort of an expansion on that: take that idea from the object level and expand it to the architectural level. Basically what it says is that the externally facing subsystems, your APIs and your apps, should get information from one location (queries), and they should update information by sending commands to another location. Your commands and your queries are coming from and going to different places. We'll talk more about how that actually happens, but that's the general idea: no longer are we updating and asking for state from a single database. That's basically what it means, and that one decision is going to have implications that play out through our entire architecture.

So what does CQRS look like? This is basically the simplest example of a CQRS system you can have. We've got a web client served by a web app, and we've got a back-end microservice; we just call it "service" for now, but we can pretend it's an order service. So this is like Amazon v2, after they took it from a monolithic app and said, you know what, we're going to grow. You can imagine that when you click a button in the web client, it sends either a GET or a POST to the web app. In a normal monolithic app, that thing would just go update something in a database and immediately send back some response. What we're going to do instead is go the other direction: we're going to send a command to the order service and say, order service, update your state. He takes that command and runs his internal logic; notice he has his own little database, and that database only stores the data he cares about for his part of the system, and that's orders. If this were MongoDB, it would literally have one collection in it, the orders collection. When he's done, he emits an event, and that event goes over some sort of transport, whether it's a queue or whatever, and some other process, or any number of processes that care about that event, listen to it and say: ah, someone's told me an order has been created, I want to do something now. It's up to the service that receives it to determine what it does.

There's a special service here, and it's going to be a bit of a mystery service that we talk about a lot: the denormalizer. We're going to spend a lot of time talking about how the services that sit below it, like an order service or a fulfillment service or a shipping service, work, and we're not going to say exactly what the denormalizer is yet, except that it listens for the events coming out of the other services and writes them to a horizontally scalable database in the form that's easiest for the UI to read. We'll get back to that. So you can see that our command, and the result of it, end up making it back to our web client: commands go to the services, which emit events, and those are separate from where we do our queries. Because we made that decision, we don't read and write from a single DB, we just read from it, and we now have a unidirectional flow through our architecture, and one that's eventually consistent. In the middle of any single nanosecond, at some point in time, the order service's database is not going to be the same as the denormalizer's database, the one our web application reads from. We just decide that's okay, because we're going to get a bunch of benefits from accepting a little bit of delay.
Right now there's just one little event coming over here, but you're going to see this pop up in a couple of slides: the events that pop up as a result of a command, we call that "the dance," and that has to do with what this distributed architecture pattern is called; we'll get to that.

So what does a larger system look like? You might have multiple external-facing apps, and with the unidirectional flow this pattern ends up looking about the same. An outside-facing application or an API sends commands to tell the system to update its state; the recipient of that command is whichever service handles that type of command. So this is the order service again: we say "create order," he actually creates an order, stores it in his local database, the transaction is now complete, and he tells the world "my order has been created." He publishes an event, and now any number of services can listen to it. The denormalizer listens to it, because we want to update our horizontally scalable database and get that back to the web app. But there happens to be another service here listening too; maybe that's a fulfillment service. He listens for when orders are created so he can then dispatch them; he talks to UPS or the Postal Service or whatever, but that's his job, he knows how to do that one thing well. This just plays on top of itself: the web API can also make calls to, say, a user account service, maybe a partner has set up a new account, so that service does its logic and then emits some events, and other services care about that. The denormalizer does, because he's going to update the horizontally scalable database so our UI can read it, but the other services might care too, because they're going to use some of that user information when they process things like shipping. So it's a unidirectional flow, it's eventually consistent, and now we can see there's more than one event over here; that's starting to comprise this thing called the dance.

What am I talking about when I say "the dance"? We can see that as commands come into the system, domain logic gets performed and events get emitted telling other services, and this chain reaction happens. All of a sudden this interesting, organic play, where a command causes a series of events from a number of different services, begins to play out. There's an architectural design pattern that describes this, and it's called choreography. The easiest way to describe a choreography is to say what it's not, which is an orchestration. In a distributed system, if you have a broker in the middle whose job is to handle a workflow and constantly, one step at a time, tell services "do this," wait for a response, then tell the next service "do this," and orchestrate those things so that some distributed work can be done, that's called an orchestration. The problem with that is you have a single point of failure, and you also have a single point of scalability issues when your system starts to grow. What CQRS uses is a choreography, which says that the decision about what events get published lives within every single service.
So imagine you're at Amazon, and Amazon just brought on some new team, like a drone shipping team. They want to know which new orders have been created so they can ship them out. Instead of having to communicate with some team that runs a central broker, they just spin up a new service and have it listen for orders. When the orders come into that service, it checks a property that says drone equals true, and then they know they're the ones who are going to deliver that thing. So it's super easy for them to get started, they know how to shoehorn their service into the system, and now it just organically scales out. The series of events that come after a command is this choreography, this dance.

Okay, so we have a couple of different colors here for our lines: a yellow straight line and a yellow dotted line. As a quick little key, yellow lines mean messages; they're messages going over some sort of transport, whether it's RabbitMQ or Kafka for message queuing, something like that. Commands and events are both messages, but commands are sent directly to processes, and commands tell a process what to do. This is just a bit of nomenclature that we decide on, and it's going to help us build this system a lot more easily. So: commands tell a service what to do. They're sent directly to a receiving service, usually from one service to a queue, and then that next service picks things up off the queue. They're sent fire-and-forget, and the language that describes them is usually a present-tense directive, like order.create. Events tell the world that a service is done doing some action: the command came in, told it to do some action, the logic happened, things were saved to the database, we know we're 100% done, so we publish an event and tell other services that we've completed those things. That's done in a broadcast fashion, and getting back to the nomenclature, we usually name events in the past tense, because again we're describing the fact that we just performed some logic in our service. Okay, any questions so far? Cool.
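As a rough illustration of that naming convention (these message shapes are hypothetical, not taken from the slides): a command is a present-tense directive sent to the one service that owns the action, while an event is a past-tense fact broadcast to anyone who cares.

    // Command: "do this" - sent point-to-point to the service that owns orders.
    const createOrder = {
      type: 'order.create',
      data: { userId: 'u-123', items: ['sku-1', 'sku-2'] }
    };

    // Event: "this happened" - broadcast after the order service finishes its work.
    const orderCreated = {
      type: 'order.created',
      data: { orderId: 'o-456', userId: 'u-123', items: ['sku-1', 'sku-2'] }
    };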
So when you talk about a CQRS system: this is now our second glance at what a larger CQRS system looks like. Remember, one of the things I said is that it's a bit of a big thing to understand, so what we're going to do is learn the ideas and the principles; we start with this big view, we break it down, and we build it back up in our heads so we understand it all. When you're talking about a CQRS system, you typically have front-end and back-end services. Front-end services are user facing, or maybe customer facing, like a third-party vendor: they can be a web app that's feeding some JavaScript UI, they can be a web API that's feeding some mobile app, or they can be some sort of third-party gateway that goes out to one of your client providers. Then you have back-end services; these are the things that do what your business does, so if you're a bank, there's going to be some service that runs a ledger, things like that. For now we're going to take that big picture, break it down, and start with the front end.

So one of the questions is: what's different about a web application that exists in a CQRS architecture versus a monolith that just talks to and writes to a database? Well, the first thing is that we still query the database to get the state. In a monolith you still query the database; if you go hit some Meteor reference app, it's just going to ask what the current state of the database is and show it to you, and that's still the same. The difference is we don't directly modify that database: we get the data from the database, and once we want to modify the state, we tell someone else to modify it on our behalf. The way we do that is we send commands, in the way we've been explaining so far. So apps expect that the read-only representation of the state will eventually make it around; we talked about how it's unidirectional and eventually consistent. Client-facing apps trust that after they tell a back-end service to do something, the result of that will make its way all the way back to the denormalized database. One of the reasons they can trust that is because we tend to implement the commands and the events, these messages going between services, using a reliable transport: RabbitMQ, Kafka, ZeroMQ, etc. These are means of sending a message from one process to another such that if the process crashes, or happens to be down for maintenance, when it comes back up it will immediately see there's something on the queue and begin going again. So you know that even if some things may be turned off along the way, eventually, when everything's running as normal, each step is sort of transactionally complete and things make it all the way through the system.

So let's pick a transport. Again, you can use Kafka or any number of different transports; I build things on RabbitMQ because it's super reliable. RabbitMQ is an open-source messaging broker, and it allows you to do direct sends between two processes using a queue; it also allows you to do broadcast, or fan-out, and topic-routing messaging, which means you use keywords to describe the different messages going out and it will route them to the receivers. This is how we allow our processes to specify which events and commands they care about. It's super high performance, built in Erlang, it's open source, and honestly probably at least half of the major financial institutions in New York are running RabbitMQ somewhere; it runs exchanges, it's super powerful. brew install rabbitmq, or whatever you do on your operating system.

So let's pick a framework: how are we going to use RabbitMQ from Node? Let's pick a minimalist framework and make sure it's not some big unwieldy thing like Rails, where you just want a piece of it but you get all of it. I wrote this; it's actually one of the first things I ever wrote in Node, when I was starting GoChime and wanted to make sure I could build these distributed systems: servicebus. It's super simple messaging in Node, basically the simplest API you can make on top of RabbitMQ. And by simplest I don't mean the least lines of code; I mean that getting it running in your process is where the least lines of code are. It supports all the different mechanisms that RabbitMQ does: direct send, pub/sub, fan-out, topic routing, etc. It's also used in a financial exchange and in online advertising, like I said at the beginning, and we'll look at some code in a little bit to see just how easy it is to use.

So again, what's different is what we're talking about right now for a client-facing app, and the question is: how does this work? We've said that we read from the database, and instead of writing back to the database we send a command to some outside service. So how does that work?
Well, this is how it works, and this is the simplest usage of servicebus you can possibly do. You just require servicebus and instantiate a bus. In this particular instance, since there's no options parameter being passed into the bus() call, we're just saying use RabbitMQ and expect there to be a RabbitMQ instance on localhost with the default port and all that kind of stuff, and it will just be ready to use as soon as you call bus.send. You don't need to wait for a connected event and register stuff; it will actually queue that internally, so if your RabbitMQ instance is slow to connect, as soon as it's ready everything gets flushed out. I've never seen that become a problem, because RabbitMQ is extremely fast at starting up and Node is extremely fast at initializing all this stuff.

So what we have here is we instantiate our bus and we do a bus.send call. Again, RabbitMQ has basically two different ways of sending messages: direct from one process to another on a queue, or broadcast fan-out, and for the broadcast stuff you can do some special routing. For this case we're sending our command from the web app directly to one service, and we'll call it the order service.

Question: what does it mean for it to be reliable, how are the messages reliable? So RabbitMQ allows you to make durable queues and also have durable messages on those queues. You can bring your processes up and down, but the queues will still exist, and they're persisted to disk. Unless messages have been acknowledged, they remain on the queue. So when this order service is done and has saved to its local database, that's when he says "ack this message," bam, it's off the queue, you're done, and now you publish your messages, which we'll get to momentarily.

So again, these are fire-and-forget. bus.send takes a command name and then the command itself, so it's going to use RabbitMQ to send a command via order.create, and then some service is going to determine "I care about order.create" and be the one to listen to it; we'll get to how they do that momentarily.
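Here's a minimal sketch of what that looks like from the web app's side. The Express route and payload are invented for illustration; the servicebus calls follow the usage described above, but check the servicebus README for the exact API.

    const express = require('express');
    const bus = require('servicebus').bus(); // no options: expects RabbitMQ on localhost

    const app = express();
    app.use(express.json());

    // Hypothetical route: the client clicks a button, and instead of writing to a
    // database we send a command to whichever service owns orders.
    app.post('/orders', (req, res) => {
      // fire-and-forget: the order service does the actual work
      bus.send('order.create', {
        userId: req.body.userId,
        items: req.body.items
      });
      // respond right away; the denormalized read model catches up eventually
      res.status(202).json({ status: 'accepted' });
    });

    app.listen(3000);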
So once you've sent something like this, which is pretty simple (instead of updating a database via an update call, we did this thing), then what? Then we wait. How long do we wait? It depends on how fast our eventually consistent distributed system is, but typically it's pretty quick. We wait for our queries to bring us back the information we need, and there are a bunch of different ways we can get that information. If we have a reactive or real-time system, what we'll find is that as the message goes through that unidirectional flow and the denormalizer saves it back into the database, if we're doing something like oplog tailing, the moment that update or insert comes in and the record changes, it will reactively, automatically update our UI. It's really cool to see these systems working, where you know things are going through these back-end processes and your front ends are just automatically updating, and seeing them run in a test runner is even cooler. But CQRS doesn't really care what specific technology you use; that's up to you. You could use RethinkDB, you could do CouchDB streaming, you could do polling, GraphQL, it doesn't matter, because this is an architectural pattern that guides how you put these things together, and it gives you the ability to choose, at every little point, what's best.

One thing to note is that there are plenty of instances where product design solves this problem, in a way that might sound outdated but that you come across all the time. The product design part here is: you send the customer to a "thanks, we're processing your order, check back later for updates" page, and then they come in, they refresh, and at some point when they refresh there's new data. One place you've all seen that is the amazon.com ordering process. Before you submit your order, Amazon tells you the estimated tax, and it has a little asterisk next to it saying this is an estimate and may change. As soon as you submit your order you're taken to a page that says thank you, and then minutes later your municipal and federal taxes are all put together and come to you in an email, and within some time you can go and start checking your order pages. Clever product design helps you deal with different wait times between your processes. Most of the time you're able to design things that are very quick, but sometimes companies like Amazon want to do a little bit of CYA; they don't want their customers to have to demand that everything is already consistent and already known about the order the moment they've hit the button, and most customers don't care. So again, we wait, and we're waiting for the back-end choreography to complete, the dance that we talked about, because it's eventually consistent. So that's the front end; any questions on that stuff so far?

All right, cool, let's talk about the back end. This is what our back end looks like inside this little box. Remember, we said that front-end-facing services send commands into the back end; the back-end processes do some stuff and they emit some events; maybe that makes other back-end processes do more; and that eventually collects in this denormalizer thing, whatever that is, we're going to get to it. That's what the back end is, but let's break it down even more and talk about the perspective of a single service in the back end: what does he look like, what does he care about, and what does he do? These four points are pretty much it, and let's pretend this is an order service again. A back-end service can listen for commands and subscribe to events; "listen" basically means listening for people to send me things via a RabbitMQ queue, and "subscribe" means subscribing to broadcast or topic-routed messages. It performs some business logic: once it gets that message, it's been instructed to create an order, so the service goes and does the actual processing to make an order, and that business logic lives inside its codebase. Then, when that's persisted and done and it has updated its local state, it publishes its events and tells external services about the updated state. Any questions on that so far? All right.
So what does that look like? Using servicebus, this is pretty much the simplest representation you could make: basically a single-purpose microservice using servicebus. We require our bus; instead of doing require('servicebus') right there, I put that in a different file, because you can actually do some configuration of servicebus, give it some config and things like that, so we've made it into a little singleton that's reusable. And we have this function called create, and that's what does the business logic; we don't need to figure out exactly what that is, but for Amazon they've got some code that creates their orders. servicebus is going to listen for order.create commands sent in from front ends, and that makes it so that any time a new command is pushed to the service over RabbitMQ, that command object is available and this function fires. We're able to then pass that to our create function, the logic goes and saves things, and when it's done, if an error doesn't come back, we can do bus.publish of order.created, and now we've informed the outside world of the fact that our service has successfully updated some internal state. Notice that we're listening for commands and we perform the business logic, like we said, and we also have these reject and ack calls. Those are important, because there are two scenarios here: we either create our order or we fail doing so. If we fail, we still want to get that message off the queue, so we don't have what's called a poison message. A poison message causes your application to crash; it then restarts, reattempts that message, and doesn't know how to get out of the loop of doing that forever. So we want to have some logic in here that says, well, if things do fail, maybe under certain circumstances, maybe it's failed for the fourth time, you just go ahead and reject that thing, get rid of it, and log the error so some humans know what to do. And of course if we're successful, if we've been able to publish after doing all of our logic, we ack it, it's off the queue, and what this service needed to do is now transactionally complete.
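A hedged reconstruction of that order service: the create function is a stand-in for the real business logic, and the exact option and ack/reject call shapes are my best guess at servicebus's API based on how it's described here, so treat this as a sketch rather than the literal slide code.

    const bus = require('servicebus').bus(); // in the talk this lives in a small shared module

    // stand-in for the real business logic that persists an order to this service's own DB
    function create(cmd, cb) {
      const order = { orderId: Date.now().toString(), items: cmd.items };
      // ...save the order to the local orders collection here...
      cb(null, order);
    }

    // listen for order.create commands sent from front-end services
    bus.listen('order.create', { ack: true }, function (cmd) {
      create(cmd, function (err, order) {
        if (err) {
          // log and reject so we don't end up with a poison message loop
          console.error('order.create failed', err);
          return cmd.handle.reject();
        }
        bus.publish('order.created', order); // tell the world we're done
        cmd.handle.ack();                    // this service's transaction is complete
      });
    });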
So we talked about how a service takes in a command and emits an event, and maybe some other services are subscribing to those events and may then be motivated to do some internal work, because they cared about the fact that an order was created: an order.created event was published from some previous service. So what does the downstream service look like, what's its perspective when it lives inside this back-end distributed system? Well, it listens for commands and subscribes to events, it performs some business logic to process commands, it updates local state, and it publishes when it's done. It's pretty much the exact same thing as the first service; it just happened to not be the first one called in this particular line of the choreography, the dance. We're building up this system over time, and the way your system is built, at first perhaps only one service takes some command to do something, because that's the intent of your system when you first build it, but over time more services get built that take different commands, and they end up dancing around and causing other services to take actions after you've published the event saying that the business logic was completed. Does that make sense?

All right, so what does a sample downstream service look like? Again, this is the simplest possible one. The only difference here is that we're using subscribe instead of listen, and remember, these are single-purpose microservices we're building right here, so the only purpose of this fulfillment service at the moment is to listen for order.created and do some fulfill logic. This one just subscribes to order.created, which was emitted from our first service, the one that was told to do something by a command, and it goes and does its fulfill logic, and then again it either rejects or acks after having published.
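And a similar sketch of the downstream fulfillment service: the same shape, except it subscribes to the broadcast event instead of listening for a command (the fulfill stub and the order.fulfilled event name are invented for illustration).

    const bus = require('servicebus').bus();

    // stand-in for the logic that actually dispatches a shipment (UPS, Postal Service, etc.)
    function fulfill(order, cb) {
      // ...talk to the carrier, save local fulfillment state...
      cb(null);
    }

    // subscribe to the event published by the order service
    bus.subscribe('order.created', { ack: true }, function (event) {
      fulfill(event, function (err) {
        if (err) {
          console.error('fulfillment failed', err);
          return event.handle.reject();
        }
        bus.publish('order.fulfilled', { orderId: event.orderId }); // hypothetical event name
        event.handle.ack();
      });
    });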
So we saw these single-purpose microservices; how do you split things up? You don't want a system where every single action has to have a new service that you deploy, so you end up with a giant mess. Amazon Lambda is an example of distributed systems that are completely functional: you have one function each. Some people might argue that ends up being really hard to maintain; if your entire system is just understanding what functions are listening for what messages at what time, you have this completely flat architecture, with the number of files representing the number of actions that can happen in your system, and it can start to get a little unwieldy. So some people might argue for services where related events, related actions, and things related to a domain concept in your system all happen inside the umbrella of one service. That's to say: we decide that an order service handles everything that happens with orders, or we decide that a fulfillment service does everything related to fulfillment, and we'll have the listens and the subscribes for the events that drive that business logic all sitting under the hood of one service.

So how do you start putting these things together? We could just do bus.subscribe and list them all out, but that gets a little rote. Instead I've got a module, servicebus-register-handlers, and this is just the smallest amount of convention you can put in place: you initialize it on startup, you give it some properties, you tell it where a folder is, and it will automatically register the modules that define handlers within that folder. So what does that look like? The first part is that we initialize this little module at the beginning of our service. We require our servicebus instance (here we're getting it out of the little refactored module we put it into, so we can put config into it), then we require servicebus-register-handlers and we execute that function, passing it the parameters it needs. It wants an initialized bus, so that it can go figure out what subscribes and listens to do. We define some error handling, and that's important because we need to tell it how to avoid that poison-message scenario, and we also want to be smart and have a unified way across our services of reporting errors, so that when errors do pop up, engineers know how to pinpoint where it happened, in what microservice, and so on. And then we also tell it the folder path to our handlers, so it knows where to look for modules that fit a certain format, and it will load those up.

So what does a handler file look like? A handler file has a number of different properties that can be implemented; there are a few more than this, but typically you implement either a listen or a subscribe, which is the fourth one down. You tell RabbitMQ, through register-handlers, what routing key to use, so if you're using a listen with a routing key of order.create, that's saying you're listening for the order.create command; just like the listen for order.create we saw a couple of slides ago, this is basically that, using servicebus-register-handlers. We've also said ack: true, which means this is going to use a durable queue, and you can do some special things like naming the queue to make it easy to find when you're trying to debug things. But the really interesting thing is this bottom part, where we have a callback. Previously, remember, we had to call ack or reject to deal with whether the message was popped off the queue and noted as a successful transaction or an unsuccessful one; now we're dealing with that based on whether we call cb by itself or whether we pass an error back to it, and it will go ahead and perform the logging we described on the last slide. Cool. And it's super simple to install this guy: npm install.
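I haven't verified the exact property names servicebus-register-handlers expects, so the following is a best-guess sketch of the startup call and a handler module following the shape described above; check the module's README for its real convention.

    // index.js - service startup (the option names here are assumptions)
    const bus = require('./lib/bus'); // our configured servicebus singleton
    const registerHandlers = require('servicebus-register-handlers');

    registerHandlers({
      bus: bus,
      path: './handlers',                 // folder of handler modules
      handleError: function (msg, err) {  // unified error reporting; avoids poison messages
        console.error('handler failed', err);
      }
    });

    // handlers/order.create.js - one module per command or event (property names are my guess)
    module.exports = {
      routingKey: 'order.create',              // the command this handler listens for
      queueName: 'order-service.order.create', // named queue, easier to find when debugging
      ack: true,                               // durable queue
      listen: function (cmd, cb) {
        // ...business logic...
        cb();       // success: the module acks for us
        // cb(err); // failure: logged and rejected/retried per our error handling
      }
    };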
But what about the work part? A lot of times when people talk about CQRS they also talk about event sourcing, and the two are almost mistaken for one another; it's one of the reasons CQRS is a bit misunderstood. Event sourcing is often used with CQRS, but CQRS doesn't care how you save things in any particular service. The whole point of CQRS is that every one of the services in this distributed system can determine: am I better off saving stuff to MongoDB, or a graph database, or a flat file, or do I just hold the stuff in memory because I actually get my data from some external source every time I load up? It's up to the implementers of every service to do that in the way that makes the most sense for that service. There's no reason to choose one hammer and use that hammer in every service when one service could be better made with something else. Some examples: do you need an audit trail, are you doing finance, are you building a financial ledger where you're keeping an append-only log as the representation of someone's bank account? If so, use event sourcing; I actually have a framework for that called sourced, it works great and powers the bond exchange. Or you could choose Mongoose for MongoDB; typically when I'm building these systems I'll either use sourced or just use Mongoose, everyone knows how to use Mongoose entities, and the only time I use sourced is when I need that sort of audit trail for my models. So again, CQRS makes no assertion about this stuff; it's all up to you, and that's the power of CQRS.

But wait, there's more. It turns out servicebus also has middleware. Similar to how Express has middleware, where an incoming request comes through and you can set up different functions that modify an incoming or outgoing message object, you get the same thing in servicebus. For instance, the bus.package middleware, which is included in servicebus, will automatically wrap up whatever object you send, put it as a data property on a new sort of memento object, and give you a bunch of helpful parameters on there: the date/time it was sent, the service that sent it, the type of message, and so on. Correlate will automatically add a correlation ID, which enables you, when you're logging, to log the correlation ID for the message that started some action within your system. So now you can begin to do distributed tracing around your entire system: someone clicked this button on a web application, we created a correlation ID that we're going to use to track that action, which caused a command to be sent to a service, which then caused a chain reaction, a choreography of 15 different events, but the same correlation ID is used on all of them, and so we can use Loggly, or whatever our distributed logging framework is, to put those all together and see exactly where something might have gone wrong if there is an error, or just see that things are going great.

This retry middleware, which you can get from npm install servicebus-retry, will actually modify the object, put a function on it, and override the ack and reject, such that if you fail, and you throw an exception, and you haven't rejected a message, it will retry it three times, and after the third time, if it's still failing, it will put it onto an error queue for people to look at. The purpose of that is, one, it helps you get rid of that poison-message scenario, and it also takes into account the fact that there are always going to be errors, and you need some mechanism in your system for humans to know when to come look at an error, see what caused it, and correlate that with some logs. Using servicebus-retry, and having good error handling in your servicebus-register-handlers definition, puts those things together.

But wait, there's even more. We said you can do bus.correlate and that puts on the correlation ID; it turns out there's also servicebus-trace, which is a utility that takes those correlation IDs, publishes them to Redis, and now you have a utility you can use to see your message flows in real time, using the correlation IDs that go through your entire system, and debug it. I'm working on a front-end UI so this can all be done and looked at from the web, but right now there's a fun little CLI app.
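Before moving on, here's a sketch of how that middleware gets wired up. The general bus.use pattern and the package/correlate middleware are described above; the servicebus-retry options shown here (the store in particular) are my assumption, so double-check against the module's docs.

    const servicebus = require('servicebus');
    const retry = require('servicebus-retry');

    const bus = servicebus.bus();

    // wrap outgoing payloads in an envelope (data, datetime, type, etc.)
    bus.use(bus.package());

    // stamp every message with a correlation id (cid) for distributed tracing
    bus.use(bus.correlate());

    // retry failed messages a few times, then park them on an error queue for a human
    bus.use(retry({ store: new retry.MemoryStore() })); // the store option is an assumption; see the module's docs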
So, recapping on back-end services: from front-end-facing services, commands come in; the back-end services process information; they save it; once the transaction is actually complete on the inside, they pop the message off the queue by calling ack; they then publish an event telling the world that they're done with their activity; and then some downstream service gets it, and that gets repeated ad infinitum, so service two takes over from service one, does its logic, and so on. All these published events end up coalescing at this denormalizer, which saves them into the denormalizer database, which is what updates your user interface. So the back-end services, from their little world perspective of just existing by themselves, listen for commands and subscribe to events, like we said.

But at the very beginning I said we were going to talk about this denormalizer thing, that it was going to be this mystery, and at some point we'd be wondering: what is that thing? It turns out the denormalizer is just another back-end service. Just like every other back-end service, it either listens to or subscribes to messages; the denormalizer just happens to never listen for commands, because it doesn't have any real business logic. Its sole responsibility is to listen for all the different events that describe things that happened in other services, and then write some denormalized, UI-specific representation of each event so that it can update the UI. That's to say, the way you save your order in your order service might not be the most efficient way to display it on your user interface: maybe it's got 25 different properties and a bunch of relationships to different things, and on your user interface you just have an order ID, someone's name, and a list of product names. The denormalizer's job is to take the event that was published out of some service and save just the pieces we care about, such that it will automatically update the user interface. Does that make sense? Cool.

So the denormalizer is sort of the key that completes our unidirectional, eventually consistent system: we read from our read-only database to begin with, and instead of updating that database, we told a service that's responsible for some part of our business to do something; it did its thing, it told other services about it, the denormalizer listened to it, and now we've got our information back to the user interface so the user can see that it was completed. The denormalizer ends up looking kind of like this: there are all these different handlers from servicebus-register-handlers, for order.cancelled, order.created, order.modified, and in the systems I build it's typically a Mongo database and they're just doing upserts. There's an order ID on each of these order cancelled/created/modified events; they come in, whichever one was last and has the larger version number overwrites, and boom, your front end is automatically updated as a result of the dance, the choreography of events.
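To make that concrete, here's a rough sketch of one denormalizer handler doing a Mongoose upsert; the read-model schema and field names are simplified stand-ins for what's described above.

    const mongoose = require('mongoose'); // assumes mongoose.connect(...) was called at startup

    // read-model schema: only what the UI needs to render an order
    const OrderView = mongoose.model('OrderView', new mongoose.Schema({
      orderId: String,
      userName: String,
      productNames: [String],
      status: String,
      version: Number
    }));

    // handler for order.created / order.modified / order.cancelled events
    function onOrderEvent(event, cb) {
      // upsert keyed by orderId; in practice the event carrying the larger
      // version number is the one that ends up winning
      OrderView.updateOne(
        { orderId: event.orderId },
        { $set: {
            userName: event.userName,
            productNames: event.productNames,
            status: event.status,
            version: event.version
          } },
        { upsert: true }
      ).then(() => cb()).catch(cb);
    }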
So, recapping, the big picture is the same thing we started with: client-facing and outward-facing services send commands; services accept those commands, perform some internal logic, and publish when they're done; the same thing happens as a result of that in additional downstream services; eventually those coalesce at the denormalizer, and we can now query for the updated data. If it's real time and we're using oplog tailing or whatever, it automatically gets there; otherwise we design a product where people just hit refresh, or come back later, or click a link from an email. It's unidirectional, you can have as many different outward-facing apps firing into the domain as you want, and there's this dance. One thing to note, since it's a common question: does there have to be one denormalizer? No. You could have one denormalizer per outward-facing application if you wanted to; the thing to keep in mind is that those messages are going to be multiplied, so if you have more denormalizers listening for the same messages, you may be doubling the number of messages going through your entire system. The better way to do it is to have one denormalizer and make it horizontally scalable.

One thing I've noticed is that CQRS allows you to make important decisions about processes at the level of every single process. Notice that some of these different processes have single boxes and some of them have multiple: because we have this loose coupling between all our systems, we've said we're going to horizontally scale our order service, and we're not going to horizontally scale our denormalizer, but we could if we wanted to. It allows us to make the decision at every different point, so we have this extremely malleable system. And when we onboard a new service, a new little green rectangle comes in here, it subscribes to events, new dotted lines go to it automatically, it can emit some new stuff, we update our denormalizer to take in its events, and once we've gone through QA and we know the service is doing what it's supposed to, those get back into the denormalizer DB and make it to the front end. So hopefully, if you come across a conversation where people say something along the lines of "CQRS, I have no idea what that means, I think it was made up," hopefully that myth is busted for you now. So thanks; any questions?

Cool. So the question is: is it an anti-pattern for the denormalizer to publish? Yes, it is, and the reason is you don't want the denormalizer to have any domain logic, because once it has domain logic, that logic should actually be in a service of its own, one service that knows how to do that one thing well. All of these things are principles and not laws, though. Maybe there's some point when you're building your system where you say, you know what, we really need to build this monitoring service, so every time an order gets created we actually go write to this other thing or tell this other service what to do. You can always choose to do an intermediary step and say, for now we're going to shove this code into the denormalizer, it's going to republish something, and we know we're going to mark it as code to revisit later. It's always up to you: you don't have to build a 100% reference CQRS system; you can take these concepts and apply them to something you're building, and they can still be beneficial even if you're not 100% CQRS.

Next question: yep, it does; yeah, it's persisted. It's persisted because you want it to be durable itself: if the back-end services go down and aren't working, you still want the system to represent its last known state, and then as things come back online, the messages that are on queues begin getting processed by the different services, they begin publishing again, things make it back to the denormalizer, and now your data is coming back to you in real time.

That's right, yeah. So the question is: where is the true golden copy of the data? The true golden copy of the data is living in smaller, more manageable databases that are paired up with each individual service. One team can be responsible for the data regarding orders being created, another team is responsible for the data regarding shipping, and you end up avoiding having this huge monolith of code where teams are stepping on each other and a change to one piece of code causes another team to have to deal with change control. It's totally loosely coupled.

Next question: so that's not really how RabbitMQ works; you can have a federation, or a cluster. The question is: what happens when I add more RabbitMQ servers, what happens when I need to scale this, when my business is making money so I want to start making this thing bigger, more subscribers, more services, more throughput through the system? You do have to scale up RabbitMQ (again, you can choose Kafka, you can choose ZeroMQ, whatever you want), but for the RabbitMQ implementation you can have a cluster, and they act in unison. You end up having sort of a virtual instance name on RabbitMQ, and those define separate domains through which all the messages flow, so you can scale up to five RabbitMQ servers and still not get redundant messages sent; it manages that for you, and it will distribute the connections and the subscriptions across the instances for you.

Got it, yep, that's a question: okay, so, many instances of the same service. Right, so I answered the part about multiple RabbitMQs; what happens if you want to scale your service horizontally, like that order service where we had 16 different little green boxes stacking up diagonally?
What you need to do is figure out how to shard your data. For instance, when we built Electronifie, we built a corporate bond exchange; you can't have all of your bonds being processed by one single process, it's just like you can't scale a box that big. Same with a stock exchange: there are so many different stocks that you can't have the trading engine for a stock exchange all in a single process, so you need to scale them. What you end up doing is being creative with the command and event names, so you use that naming pattern of order.create, but maybe what you actually do is order.<identifier>.create. Now you can use the topic-routing capabilities of RabbitMQ so that when a service starts, it asks some other service, "I am number one of sixteen, what are my subscriptions?", and it gets back an array of strings. It then loops through those strings, subscribes to them all, and now sixteen out of five thousand different stocks or bonds, or whatever, have their orders routed to that process. So you see, what you're effectively doing is sharding, and if you have that problem it's probably a good problem to have; people are using your system.

Yeah, so the next question is: how do different services find out about what topics are available, what commands exist in the system, what events can be published? There's no central list of these things. RabbitMQ doesn't care what you tell it to send; it will take the parameters you give it about how to send something and just do it. If you use servicebus and do bus.send, there's some code behind the scenes that says, hey RabbitMQ, assert that this queue exists, and then it just fires off to that thing. Now, if I had a front-end application and it was firing off cornucopia.create, and I don't have a cornucopia service yet, it will still tell RabbitMQ to fire off that message, and it will just get dead-lettered; it dies. So the actual list exists in your codebase; it's distributed out amongst all of your different processes, and if you're using servicebus-register-handlers it's super easy for engineers to find, because you just look in the handlers folder and look at the file names, which you should typically name after the different events or commands they're listening or subscribing to. Does that make sense? No, I've had systems that have had something like 16,000 event names, because of sharding. You end up having commands that are named after, literally, at Electronifie: the difference between stocks and bonds is that for stocks there are, I don't know, a few hundred or a few thousand of them, because every company has one stock, whereas companies can issue as many bonds as they want, so a company could have a hundred different bonds. The sharding problem for building a bond exchange is actually much harder than the sharding problem for building a stock exchange, so we had a different command and event identifier for every single bond that exists. Yep, any other questions?
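Sketching out that sharding idea: the shard-assignment lookup and routing-key scheme below are invented for illustration, but they show how a service instance can listen for only its own slice of instruments.

    const bus = require('servicebus').bus();

    // Hypothetical: ask a coordinating service which instruments this instance owns,
    // i.e. "I am number 3 of 16, what are my subscriptions?"
    function getMySubscriptions(cb) {
      cb(null, [
        'order.BOND-0001.create',
        'order.BOND-0002.create'
        // ...one routing key per instrument assigned to this shard
      ]);
    }

    function handleCreate(cmd, cb) {
      // business logic for this shard's orders only
      cb(null);
    }

    getMySubscriptions(function (err, routingKeys) {
      if (err) throw err;
      routingKeys.forEach(function (key) {
        bus.listen(key, { ack: true }, function (cmd) {
          handleCreate(cmd, function (e) {
            e ? cmd.handle.reject() : cmd.handle.ack();
          });
        });
      });
    });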
Yep, so the next question is about reads. The read is easy: it's the exact same Mongoose call you would use to read before, and if you happen to be using GraphQL it's exactly the same, it doesn't change. The only difference is we're just not updating those databases directly from the same process; we took that update call and moved it somewhere else, into its own little database, and then that service publishes something, and a similar update call happens in the denormalizer to update the corresponding location that we end up looking at in the outward-facing application. Does that make sense? It's a bit of a duplication, but not duplication in the sense of doubling your effort; it's duplication in that we acknowledge that the way we want to manipulate our data in a back-end service might be different from how we display it. So although those things may have the same names (maybe there's an order record in the database for the order service, and there's also an order record in the denormalized DB), the properties on them can look totally different. The order record in the order database, which might have only one orders collection in it and no other collections, could be some big, huge nested document with a bunch of information about that order, while the order record in the read-only DB could literally be just the order ID, the user ID, and then some read information, like the person's name, the product name, or a list of product names. So although the names are the same, we're saying that the concept of an order in the order service can be different from the concept of an order in the denormalizer. And further, since these downstream services listen to events and can do their own thing, there can also be a different concept of an order in the fulfillment service, one with maybe no readable information and only identifiers in it, because it's just a machine using that to do some business logic. Does that make sense?

And the read isn't an event; the read is the same thing it would have been. We still read data from the same place, using the same mechanisms; we're just moving some of our data to a different part of the architecture. You're still just using a Mongoose call, you're just saying Mongoose Orders.find in your web application, that actually doesn't change, you just get the information. But you understand that when you update, you're not doing a Mongoose Orders.update; instead you do a bus.send, and we understand that the eventual consistency and this unidirectional flow are going to cause the denormalizer to update the record I'm expecting to be updated. Make sense? Cool. We'll take one more, and then if there's extra time I can show a little bit of extra stuff.
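To illustrate those two shapes of "order" (both schemas below are made up): the order service keeps the rich document its business logic needs, while the denormalizer keeps only what the UI displays. In a real system they would live in different services and different databases.

    const mongoose = require('mongoose');

    // Write model, in the order service's own database: big, nested, business-focused.
    const Order = mongoose.model('Order', new mongoose.Schema({
      orderId: String,
      userId: String,
      items: [{ sku: String, quantity: Number, unitPrice: Number }],
      shippingAddress: { street: String, city: String, zip: String },
      taxes: { estimated: Number, final: Number }
    }));

    // Read model, in the denormalizer's database: just enough to render the page.
    const OrderView = mongoose.model('OrderView', new mongoose.Schema({
      orderId: String,
      userId: String,
      userName: String,
      productNames: [String]
    }));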
We'll take one more, and then if there's extra time I can show a little bit of extra stuff. What's up? Yep, yeah. So the question is: say you do want an as-immediate-as-possible response from a back-end service to a front-end service without having to go all the way through the denormalizer. Is that possible, and should you do it? Just like we were saying over here, these are principles; you can violate them if you feel like it and if you feel you should, and this is one of the times when people sometimes violate the principle and say, actually, I want to know right away. For instance, some people choose to do authorization this way, because you want to know right away; you don't want to wait for some denormalizer to update before you know if someone's logged in. For that you use RPC, and it's basically what you just described: RPC means you make a request and you set up a listener for someone to then publish an event back to you. So for this example we'd have an authorization service, and you would send it a login request command. It's going to go do the bcrypt stuff that you know auth services do and determine whether this person actually should be logged in, and if so, it publishes an event. This is a case where the front end actually subscribes, and the only additional thing that needs to be in there is a correlation ID. You're serving so many requests per second that there could be a number of different promises outstanding, and when this new thing comes in asynchronously your event loop has already ticked, so you need to be able to look up, based on the correlation ID, whether that promise even exists. If it does, now you can go ahead and fulfill that promise, and guess what, you're logged in. Does that make sense? Cool.
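The correlation-ID bookkeeping on the front-end side might look roughly like this sketch. The auth.login command, the auth.loggedIn event, and the assumption that the reply echoes the correlationId back are all invented for the example; a real version would also handle rejected logins.

// Sketch of RPC over the bus using correlation IDs.
const crypto = require('crypto');
const bus = require('servicebus').bus();

const pending = new Map(); // correlationId -> { resolve, reject, timer }

// The front end subscribes once; replies for any login attempt arrive here.
bus.subscribe('auth.loggedIn', event => {
  const entry = pending.get(event.correlationId);
  if (!entry) return;            // the promise may have timed out already
  clearTimeout(entry.timer);
  pending.delete(event.correlationId);
  entry.resolve(event);          // fulfill the promise created at request time
});

function login(email, password) {
  const correlationId = crypto.randomUUID();
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => {
      pending.delete(correlationId);
      reject(new Error('login timed out'));
    }, 5000);
    pending.set(correlationId, { resolve, reject, timer });
    // The auth service is expected to copy correlationId onto its reply event.
    bus.send('auth.login', { correlationId, email, password });
  });
}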
So I was actually expecting someone to ask: is it okay to partially do CQRS, or, if you have an existing solution, is there a way to make it a little bit more CQRS and step your way toward it? I know we're getting a little late on time, but I can use ten minutes to show you how I did exactly that to build the nycnode.com site, if you want to see it. You want to see it? All right.

So, NYC Node. Thanks to the work of Pat Scott, sitting right here, who is the DevOps wizard: he showed me how to take all this stuff, use Docker, and totally dockerize all these containers, so guess what, you can also do CQRS in containers and have it totally composable that way. Cool, so we're going to restart Docker. While that's happening, let's walk through the code. Pretty soon my talk from last month is going to be online, and you should go watch it to see how I make meta-repos out of other repos using this thing called gitslave. This entire thing is a repository, and these are actually sub-repositories, and there's a tool that knows how to keep them all in line together, so you can do a git branch on fifty repositories at once and have unified code pushes. So this is my meta-repo, and I've built up these different microservices using a quasi-CQRS pattern to power the node site, which is a KeystoneJS app I just found and forked online. I wanted to be able to spin up these different services as I added new functionality to our site, and I didn't want to have to navigate KeystoneJS, because I didn't really know it until I built this thing (now I feel like I do). Even then, I don't want to shoehorn in things like cron processes that check for new data on Meetup via the Meetup API; I don't want that to exist inside of a website. I want those to be separate microservices that live on their own, that I can scale on their own and shut down if they're broken, stuff like that.

So how did I go about doing that? There's no place in the nycnode site itself yet where I'm publishing events or sending commands, so I have sort of a quasi-CQRS situation here. What I have instead are these apps that start up and reach out to the outside world. They all have schedulers running some sort of cron job. This one runs on a schedule, and every fifteen minutes it checks to see whether Matt has created a new Meetup. It checks the Meetup API, and if new data comes back that wasn't there before, it publishes an event saying new meetups were found, with the array of meetups that exist. Those are then taken up by the denormalizer, which updates the quote-unquote scalable, read-only, horizontally scalable MongoDB, which is just the DB that KeystoneJS was already using.

Now, we'll just walk through this. I've got this import-meetups job, and there are a couple of steps it does: it fetches them (this is it going out and knowing how to call the Meetup API), and then it publishes them. When it publishes, we have publish meetup.found: this is the thing in my domain, a meetup; this is the verb, it was found; and we pass the array of meetups. Now, since we use servicebus-register-handlers to register these handlers, over here in the denormalizer there's a meetup.found handler, so it's listening for all those events, and RabbitMQ routes them to it. It logs some stuff just so I can see it in the logs, and then it goes through, and although the shape of what KeystoneJS knows to be a meetup is different from what Meetup thought a meetup was, because Meetup has all these extra properties, I shave that down and save just what KeystoneJS cares about. I'm using Mongoose, and the models actually look a little different: the KeystoneJS model that wraps Mongoose has a bunch of extra properties on it, and this one has just what we need. So the array comes in, this thing loops through all of them, and it does an upsert, not an insert, so I can run it as many times as I want. If my data goes bad for some reason, I can come back, make my code change, run a command (or just wait for cron to run it again), and it will go back out to Meetup, find all the meetups, send them to the denormalizer, the denormalizer will upsert them all, and bam, my site automatically has all of the data in it again.

The same thing goes for users and for videos found. This is the exact same thing: an async map going through all the different videos, saving them with the YouTube external ID so I know where to point the iframe that plays the videos. If we look at the YouTube ingester, it's almost identical: its scheduler just schedules an import-videos job, which knows how to fetch things from YouTube (it's really just asking what videos exist in a playlist I made on the NYC Node account), gets all of them, publishes all of them, those go to the denormalizer, and bam, I've just reconstituted my entire database. That happens on a schedule, so at any point, if my server crashes, I just git clone this thing, pull it all in, do docker-compose build and up, wait five minutes, and I've reconstituted the entire nycnode database and site. So there's no place in here where I'm using commands to tell services to do things; it's a quasi-CQRS system. But as soon as I decide I want some business logic that lives inside of nycnode, stuff we can't just get from YouTube or from Meetup, say we're collecting comments from you all, or we want to add the ability for people to make blog posts and have them saved, then we'll start having KeystoneJS send commands to a blog-post service or something like that. That consolidates the logic in one little service, and it makes its way back out through the denormalizer, which publishes and updates the read-only DB. Does that make sense? Cool.
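In isolation, the denormalizer side of that flow might look roughly like the sketch below: a meetup.found handler that upserts each meetup into the read DB keyed by its external Meetup ID, so re-running the ingester is always safe. The field names and payload shape are illustrative guesses rather than the real KeystoneJS schema, and I'm showing a plain bus.subscribe instead of the register-handlers file layout.

// Sketch of the denormalizer's meetup.found handling: upsert into the read DB.
const mongoose = require('mongoose');
const bus = require('servicebus').bus();

const Meetup = mongoose.model('Meetup', new mongoose.Schema({
  meetupId: { type: String, index: true }, // external Meetup API id
  name:     String,
  date:     Date,
  url:      String
}));

bus.subscribe('meetup.found', async event => {
  const meetups = event.data || event; // however the payload ends up wrapped
  for (const m of meetups) {
    // Upsert, not insert, so re-running the ingester never duplicates data.
    await Meetup.updateOne(
      { meetupId: m.id },
      { $set: { name: m.name, date: new Date(m.time), url: m.link } },
      { upsert: true }
    );
  }
  console.log('denormalized %d meetups', meetups.length);
});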
I don't know what's up with Docker; it says it restarted. All right, well, I'll figure out how to get Docker working here in a few minutes if anybody has extra questions. What's that? Oh, it's coming up, yeah, wait one second... here we go. So we're going to hop over to our website. This is servicebus again, the thing we've been talking about. There's servicebus-register-handlers; servicebus-retry is the thing we said will automatically change our services' behavior to retry three times and then move messages to an error queue; and servicebus-trace is the thing that lets us use correlation IDs and see real-time tracing of our systems. And of course we're about to see how we did this locally. One second, things are bouncing a little bit. Okay, cool. What happened there is that things were bouncing because RabbitMQ was coming online: services were trying to start, servicebus was throwing errors because RabbitMQ wasn't online yet, but eventually the system healed itself. So now we've got Redis for caching and RabbitMQ all set up, and we've got a number of different services running, and if I come over here and look, we now have a working site. Hang on one second. Hey Pat, what's the command to tell it to get rid of its data? That's it, cool. Okay, so this should load back up pretty quickly and we should have a totally blank site, and then I'm going to send a couple of commands to the Docker containers, and you're going to see that within a couple of seconds of it coming back online we have a site with all the same data as the completely live nycnode site. So, rebuilding via the denormalizer. Any other questions while we're waiting for this to happen?

Sure, AMQP? Yeah, servicebus doesn't have a pluggable capability for that, but there's no reason you have to use servicebus to do CQRS, although I did actually go and do half the work. Oh, sorry, right: what I'm saying is that right now servicebus is tightly coupled to RabbitMQ; however, I've done half the work on that decoupling so that you could use other providers, so that might be coming soon.

Okay, so we are back up and currently cycling, waiting for RabbitMQ, and now it's actually doing a bunch of stuff, so I need to try to refresh this fast. Okay, cool, so we just deployed a new site, and it's actually adding data right now... actually, that's just building indexes. So right now there's nothing, right? We don't even have the top of the site; the site doesn't know how to function because there's no starter data in it. We need to add a meetup, and we need to add some videos and stuff. What we can do over here... oh, it's actually started; it looks like it's pulling in all the users right now, so in a couple of seconds you're going to have a user on my local machine. First, I think we only have one meetup in here right now, because it only fetched the current ones, so let's backfill it. We'll go and say backfill, and we can see that the meetup ingester didn't log it, but the denormalizer denormalized 67 meetups. So bam, now we have all of our meetups; we just reconstituted all of our data. We should be able to do the same thing with the YouTube ingester... there we go... hmm, some sort of strange error. Oh, I think I probably needed to specify an environment variable.
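Stepping back for a second, wiring up the retry middleware mentioned a moment ago looks roughly like the sketch below. Treat the servicebus-retry options as an assumption from memory rather than the definitive API, and check the module's README; servicebus-trace is wired with a similar bus.use call, which I won't guess at here.

// Sketch only: exact servicebus-retry options and the ack convention are
// assumptions; verify against the module's documentation.
const servicebus = require('servicebus');
const retry = require('servicebus-retry');

const bus = servicebus.bus(); // assumed to pick up RabbitMQ from the environment

// Retry a failing handler a few times, then move the message to an error
// queue instead of redelivering it forever (the behavior described above).
bus.use(retry({
  store: new retry.MemoryStore() // assumed; a Redis-backed store also exists
}));

bus.listen('meetup.backfill', { ack: true }, msg => {
  try {
    console.log('backfilling from', msg);
    msg.handle.ack();             // acknowledge on success
  } catch (err) {
    msg.handle.reject();          // let the retry middleware count the failure
  }
});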
So I'd need to add an environment variable to the video ingester, but it would have done the same thing: gone and grabbed the videos and basically made the site look like the live one. So that's how you can take an existing project, this KeystoneJS project in my case, and use parts of CQRS to make a system that's more easily manageable and expandable, because all the business logic ends up living in these microservices, and it's up to you to pair up which microservices react to which things, just based on the events you're publishing, and then of course the denormalizer updates your read-only UI. That's it.
Info
Channel: The Nodejs Meetup
Views: 17,470
Rating: 4.9702234 out of 5
Keywords: nycnode.com, nycnode, node.js, nodejs, node, javascript, programming, microservices, eventually consistent, micro, services, micro-services
Id: 4k7bLtqXb8c
Length: 71min 57sec (4317 seconds)
Published: Tue Aug 23 2016