7 Reasons why your microservices should use Event Sourcing & CQRS - Hugh McKee

Captions
Good morning everyone. My name is Hugh McKee, and I'm a developer advocate at Lightbend. This talk has kind of a long title: seven reasons why your microservices should use event sourcing and CQRS. In this talk we're going to focus on the data side. It was hard to pick just a few reasons, and I picked the number seven because I thought it might make an interesting title, but there are way more than seven reasons for using (or maybe not using) event sourcing and CQRS. I tried to pick ones that aren't the ones you see all the time in the literature, documentation, and blogs about event sourcing and CQRS, so hopefully you'll find these seven interesting.

Most talks about going to microservices (and we've been talking about microservices for years now) are about taking a large, complex system of monolithic applications and replacing it with a large, complex system of finer-grained microservices. The idea is that we're trying to decompose systems into smaller components, and often the focus is on the code but not necessarily on the data. In this talk we're going to focus a lot on the data, because there's the processing part of our systems but there's also the persistence part, and I think there are a lot of interesting characteristics in what people are doing with microservices related to breaking up not just the code but the data as well. That's really what it is: we're basically splitting things apart. With microservices you often hear about taking a big old monolith and splitting it into smaller microservices. Another term I've been hearing lately is that it's just a different form of modularization: the argument is that the monolith would have been more manageable if it had been better modularized, and going to microservices is a kind of forced modularization, where you're physically breaking things apart into separately deployable code units. The same thing can apply to your data.

One of the goals of a microservice system (and there are a lot of goals) is to build a system of loosely coupled services, and the reason for loose coupling is that we want to be able to do things faster. The traditional situation is that if you have a monolith, you have some deploy cycle and you have to get agreement from everybody: is everything ready, have we done all our testing, okay, let's deploy to production. It's a bigger effort because multiple teams may be involved. The idea with microservices is that the services themselves are independently deployable.

When I talk to people, at an organization or at conferences, and the topic of microservices comes up, I ask: all right, what kind of microservices are you doing? I use three questions as characteristics. These aren't rules; I'm just trying to get a feel for what flavor of microservice you're doing. The first one is the easy one: are they independently deployable? So you've broken up the code.
The next one is a little harder, and this is where the wheels often come off the apple cart: does the microservice own its own schema? The idea is that if you and I are on a team and we're responsible for a microservice, we also own the responsibility for the schema itself. Often what happens (and there's nothing wrong with this) is that there's a legacy database and there's no way you're going to get permission to break it up. You might get permission to break up the code, but breaking up the database is a much harder battle. If that's the case, so be it, but you're making a compromise: you have a level of coupling at the database level. If you really want loosely coupled services, the goal is to get things loosely coupled not only in the code but at the data level.

Then the third question I ask: is the only way anybody gets to see your data through your API? No back doors. You have a microservice, you have your own data, and nobody knows what that data looks like except through your API. The reason for this, again, is loose coupling: your microservice should be a black box as much as possible, which gives you the freedom to make changes inside it without having to get permission and agreement from everybody else. When you have to go into meetings and talk to people and get agreement on everything, that's friction slowing you down; it's taking away the velocity of doing things quickly. Again, these aren't rules, just a way to get an idea of what kind of microservice you have, but the goal is to be as loosely coupled as possible.

Depending on who you're talking to, when you start talking about splitting things apart, it's one thing to say, okay, I'm going to split apart the code; people can deal with that. But when you say you're going to split apart the databases, it's: what, are you nuts? Are you crazy? This is not going to happen; we have DBA teams, we have formal processes, and so on. But I think you're missing out, and that's why I wanted to do this talk: to show you some of the advantages of these alternative approaches.

So let's walk through the seven reasons. The first one is really just to show you what event sourcing and CQRS are. Here's a diagram that I like to walk through quickly to give you an idea if you haven't seen this before. The first time I heard about it was in a Lightbend webinar about three years ago, and the speaker said "CQRS" and I went, oh man, another acronym, what the heck is that? It stands for command query responsibility segregation, which is a fancy term, but basically it means splitting the writing and the reading apart.

This is the basic flow: you have a service, and a client sends in a request to that service, and say this request is to make some kind of change; you want to change some data, some kind of state change.
That kind of request is thought of as a command, and a command is an intent to do something: add an item to an order, add the shipping address, add the billing address, here's a deposit for my bank account. It hasn't happened yet; it's a request. These things are handled on what's called the write side, because the write side is where we make changes to the state of an entity of some kind. The command comes in and it's validated: is this command okay, does it violate any business rules, whatever it takes to make sure this is a command that this service can carry out. Then what's emitted is one or more events, and an event is a historical fact, usually timestamped: we added this item to this shopping cart at this time. Done.

The interesting part is how it's persisted. The event itself is persisted. We're not updating something; in the case of an order, we're not modifying the order. We're just recording the fact that we added an item to the order, added another item, adjusted the quantity. Each of these is an event going into the event store. The event store is a really simple data structure, basically key-value: entity ID, timestamp, sequence number, something like that. To get the current state of the entity, you just run through all the events; applying each event in turn gives you the current state of the order. And you never change the event store; all we ever do is insert. Say you remove an item from the cart: that's an event, "I removed this item from the cart," and it gets logged and stored in the database.

What's interesting here is that I've heard the term "negative data": traditionally, if we removed an item from a cart, the current state of the cart no longer shows that item; the data is gone, because we updated the state of the cart. But we're moving into this era of data science, and data scientists love this kind of stuff: why didn't that item get purchased, why do people keep putting it in the cart and taking it back out? That's data we're now capturing.
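To make the write side just described a bit more concrete, here is a minimal sketch in plain Scala (no Akka or any other framework; the names such as `AddItem`, `CartEvent`, and `CartEntity` are hypothetical, not from the talk). It shows a command being validated, events being emitted as facts, and the current state being recovered by folding over the stored events.

```scala
// Minimal write-side sketch: commands are validated, events are appended,
// and current state is recovered by replaying (folding over) the events.

sealed trait CartCommand
case class AddItem(itemId: String, quantity: Int) extends CartCommand
case class RemoveItem(itemId: String)             extends CartCommand

sealed trait CartEvent
case class ItemAdded(itemId: String, quantity: Int) extends CartEvent
case class ItemRemoved(itemId: String)              extends CartEvent

case class CartState(items: Map[String, Int] = Map.empty) {
  // Applying an event is a pure function: (state, event) => new state
  def applyEvent(event: CartEvent): CartState = event match {
    case ItemAdded(id, qty) => copy(items = items + (id -> (items.getOrElse(id, 0) + qty)))
    case ItemRemoved(id)    => copy(items = items - id)
  }
}

object CartEntity {
  // Command validation: reject commands that violate business rules,
  // otherwise emit the event(s) that record what happened.
  def handleCommand(state: CartState, cmd: CartCommand): Either[String, List[CartEvent]] =
    cmd match {
      case AddItem(_, qty) if qty <= 0                 => Left("quantity must be positive")
      case AddItem(id, qty)                            => Right(List(ItemAdded(id, qty)))
      case RemoveItem(id) if !state.items.contains(id) => Left(s"item $id is not in the cart")
      case RemoveItem(id)                              => Right(List(ItemRemoved(id)))
    }

  // The current state is never stored directly; it is derived from the event log.
  def replay(events: Seq[CartEvent]): CartState =
    events.foldLeft(CartState())((state, event) => state.applyEvent(event))
}
```

Note that a "remove item" simply becomes another appended `ItemRemoved` fact, which is exactly the "negative data" point above: nothing is erased from the log.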
Next, these events are persisted, typically in some kind of event store. Cassandra is a really good database for this, for example; it doesn't have to be Cassandra, but it's a good fit because it's just key-value. Then the events somehow need to get over to what's called the read side, and the read side is a data store that's set up for querying. Often it's not fully normalized; the data is stored for query. So one side is really just for capturing events (the write side), and the other side is for query.

When we want to do a query, say to get the current state of the order, or maybe the question is "what orders has this customer placed?", those are queries that this service should provide as part of its API, and those queries are done on the read side. The request that comes in to do a query is a kind of read-only command; that command triggers a query against this other data store, which could be a relational database, it could be Elasticsearch, it could be both. As you'll see, you have a lot of flexibility in how you actually store the data for querying because of the way the mechanics of all this work, and you get back whatever results you need. So when you're defining the microservice, you're also defining which queries this microservice should provide, and then you design the read-side database to support those queries. There's the write side, which again is just a really fast, simple key-value store, and then the read side, which is: go nuts, whatever is the best data store for doing the queries this service needs to provide.
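As a rough illustration of that read side, here is a sketch (again plain Scala with hypothetical names such as `OrderSummaryView`; in practice the rows would live in a relational table, an Elasticsearch index, or both) of a denormalized view built purely to answer the queries this service promises in its API, fed by events arriving from the write side.

```scala
// Read-side sketch: a denormalized view maintained from events, designed
// around one query the service exposes: "what orders has this customer placed?".

sealed trait OrderEvent
case class OrderCreated(orderId: String, customerId: String)   extends OrderEvent
case class ItemAddedToOrder(orderId: String, priceCents: Long) extends OrderEvent
case class OrderApproved(orderId: String)                      extends OrderEvent

case class OrderSummaryRow(orderId: String, customerId: String,
                           status: String, totalCents: Long)

class OrderSummaryView {
  private var rows = Map.empty[String, OrderSummaryRow]   // keyed by orderId

  // Called by the projection as events arrive from the write side.
  def handle(event: OrderEvent): Unit = event match {
    case OrderCreated(orderId, customerId) =>
      rows += orderId -> OrderSummaryRow(orderId, customerId, "NEW", 0L)
    case ItemAddedToOrder(orderId, priceCents) =>
      rows.get(orderId).foreach(r => rows += orderId -> r.copy(totalCents = r.totalCents + priceCents))
    case OrderApproved(orderId) =>
      rows.get(orderId).foreach(r => rows += orderId -> r.copy(status = "APPROVED"))
  }

  // The query the service's API promises to answer.
  def ordersForCustomer(customerId: String): Seq[OrderSummaryRow] =
    rows.values.filter(_.customerId == customerId).toSeq
}
```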
Moving on from the design phase: there's domain-driven design, which, if you haven't heard of it, is a very common way for people to identify which microservices to build. It's a design process where one of the outcomes is a list of your microservices. I'm not the domain-driven design person; I'm a developer, so I'm usually on the receiving end of those design efforts, but that's the idea. Another one that's really interesting is event storming, and event storming complements domain-driven design. The idea with event storming is that you get the dev team involved, you get the business people involved, you put everybody in a room for a day or two or three, and all they think about is: what are the events that flow through the system? They're usually doing it with sticky notes on a board, and the notes gravitate toward each other: these are all the events related to orders, here are all the events related to shipping, here are all the events related to the catalog, whatever flows are in the system. The nice thing is that the outcome of this exercise is a good idea of what microservices are really there, what the opportunities for microservices are, as well as a pretty good idea of the commands and events that flow through the whole system. I think that leads very nicely into an implementation using event sourcing and CQRS, because when you implement services that way your head is in the event game anyway; you come out of design and into building the services with this whole event mindset.

Another reason: reduce service coupling. Say we have three services, and I'm going to run through a couple of scenarios. Say service 1 needs to retrieve some information from service 2. Okay, we'll just do an HTTP REST GET, pull some information from service 2, get it back, everything's good, really easy to implement. But what happens when service 2 goes down? This is a form of coupling: when service 2 goes down, that might effectively take service 1 down too. The blast area of one service going down can drag down other services, which isn't a great thing, and it's a sign of coupling. Again, it's really easy to implement, just a simple synchronous get and retrieve, and everything works great when everything is working, but it doesn't work very well when things start to break. It's a matter of how you like your life to be in production: do you like it nice and simple with no problems, or do you like excitement, outages, and downtime? I don't like excitement, outages, and downtime.

It gets even more fun when services are doing things collaboratively. For example, say service 1 is an order service and service 3 is a customer credit service, and we'll walk through a little scenario. Events are coming in for an order: add item, shipping address, billing address, credit information, whatever, and finally the user says, bam, I want to submit the order. When that submit-order event comes in, the order goes into a new state; we have a new order sitting there and its current state is "new". Well, the customer credit service wants to know about that, so somehow we need a message from service 1 to service 3: hey, we've got a new order, check the credit for this customer. Some message goes over, service 3 does a credit check on the customer for this order, and it may do a state change; now we have a second state change, in a separate service. The credit check is going to reply with either order approved or order rejected, and that response goes back to service 1, which changes the state to order approved or order rejected.

There's an interesting flow going on here: we have basically three transactions that occur independently, at different times, in different databases. In our good old monolith this was probably a single transaction; life was comfortable, one transaction, boom, we're done. But now somebody got the bright idea to split the whole system apart, and we're doing things all over the place, so how do you keep all of this consistent? This is where things get interesting, because the question is not if something breaks but when something breaks. You just have to face it: you have to think about your system not just on the happy path. When things break in the middle of the night, how are your services going to ride that out with the least amount of pain and sorrow for the people responsible for production support, for the customers, and for the business?

So where can things go wrong? Lots of places. The network could break between service 1 and service 3, so service 1 can't tell service 3 that it has new orders. To make it worse, service 3 could have received some messages (hey, check the credit on these customers) but, before it could act on them, boom, it goes down; how do you recover from that? Even more interesting, say service 3 did do the check, changed the state of the customer, and was trying to tell service 1 what happened, but the network was down. And just to make it even more interesting, service 1 got the responses back but went down before it could act on them. Things can break anywhere here.
You have to fill all those cracks, all the little holes where you're going to lose messages, because the last thing we want is customers calling up and saying, "I've got this order that I placed five days ago and it's sitting there in a new state; why is this happening?" Then you go in and start looking at the system, digging around, wondering what the heck happened, and when you finally find the problem you realize: oh wow, we've got leaky messages; how are we going to plug this hole? It can be pretty nasty.

So here's the idea, looking at the inside of a microservice in this first diagram (you're going to see this diagram a lot for the rest of the talk). We've got the write side, and that's a transaction: we're writing events into the write-side database. On the read side, one approach is pulling: there's some kind of reader that pulls events from the write side and puts them into the read side. Our intuition says, why don't we just push? That's a viable approach, and we'll look at it a little later, but the pull approach is typically cleaner to implement and at least as robust as any push approach. That's the inside of a microservice: the mechanics of handling events coming into the write side and getting them over to the read side.

The same basic pattern can apply between services. When a new order goes in and sits in the new-order state, service 3 is looking at those events and pulling them over, and when it sees a new-order event it knows: I've got an order whose credit I need to check. The beauty of this is that service 1 is completely oblivious to who's consuming data from it. It's a producer, but it's not actively pushing data out to other things; it just says, if you want my data, use my API to get it and do whatever you want with it. The same can be true for the customer service sending messages back to the order service; the flow can be reversed, so that service 1 here is the customer service and service 3 is the order service: when the credit is approved or rejected, the order service is watching for events happening in the customer service and pulling those back. You've really broken things apart now.

This also applies when people say, "well, we're using Kafka, Kafka's our message bus, everything's great." Guess what: the pattern in Kafka is the same thing. The consumer is pulling from Kafka, and it's doing it with this offset approach. I'll show you another look at how you get data from the producer into Kafka without losing anything, but the fundamental approach that works here is this pull approach. So even with Kafka you're not completely off the hook for worrying about losing data; we'll look at that a little more in a bit.
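Here is a minimal sketch of that pull approach (plain Scala; `EventSource`, `OffsetStore`, and `PullingReader` are hypothetical names for illustration, not a real library API). The key idea is that the consumer owns its progress, a stored offset, and repeatedly asks the producer for everything after that offset, which is what gives it at-least-once delivery across crashes.

```scala
// Pull-based reader sketch: the consumer tracks an offset and repeatedly asks
// the producer for "everything after offset N". If the reader crashes, it
// resumes from the last offset it saved, so events are delivered at least once.

case class StoredEvent(offset: Long, payload: String)

trait EventSource {            // the producer side: write-side journal, another service's API, ...
  def eventsAfter(offset: Long, limit: Int): Seq[StoredEvent]
}

trait OffsetStore {            // where the reader persists how far it has gotten
  def load(readerId: String): Long
  def save(readerId: String, offset: Long): Unit
}

class PullingReader(readerId: String, source: EventSource, offsets: OffsetStore,
                    process: StoredEvent => Unit) {

  def pollOnce(batchSize: Int = 100): Unit = {
    val from  = offsets.load(readerId)
    val batch = source.eventsAfter(from, batchSize)
    batch.foreach { event =>
      process(event)                       // e.g. update the read-side view, or trigger a credit check
      offsets.save(readerId, event.offset) // remember progress; a crash here just means redelivery
    }
  }
}
```

The same loop works whether the reader feeds a read-side view inside one service or another service entirely, which is why the producer can stay oblivious to its consumers.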
The point here is that we're looking at an asynchronous approach, which is what I just walked you through, versus the typical do-everything-with-synchronous-HTTP-REST approach. Whichever approach you're looking at, part of the process should be: don't just consider the happy path; throw rocks at it. The fun part, I think, is when we're all together as a team thinking about our service and how it interacts with other services, and somebody says, well, what happens when this breaks, what happens when that breaks, what are we going to do? You're constantly challenging yourselves at every single step. This is where I really like this pattern, because what emerges from that kind of conversation is that right away you look at the synchronous approach and go, oh man, there's no way to recover from this; we're going to lose messages. Is losing messages acceptable? If it's not, you'd better come up with an alternative approach. Sometimes it's okay to lose messages, that's fine, but often, like in the scenario I showed you earlier with the order and the customer, you can't afford to lose any messages, not a single one. I've been in that situation in production, where millions and millions of messages go through and every once in a while you lose one, and you're a jerk and the business hates you and your managers are on your case, and you're going, oh my god, how am I going to fix this? That's why, when I came across something like this, it was: okay, here's a solution that I know I can make work.

Moving on, another advantage is breaking the read-versus-write performance bottleneck. What I mean by that is that with a single database we're always making trade-offs: if you optimize for reads, it comes at the expense of writes, and if you optimize for writes, it comes at the expense of reads. It's the index game: how many indexes can I add to this database before it hurts too much? This is another advantage of the segregation, of splitting things apart. The write side is really optimized for writes: it's just insert, insert, insert; there are no updates, there are no deletes. As I said earlier, a delete just leaves behind negative data, an event that says "I removed this item from the shopping cart." It's like an accounting journal: accountants don't use erasers; they have to enter compensating entries into the journal. You never update; you're always just adding to the log. The read side, on the other hand, is really optimized for query, so it's not obsessively normalized if it doesn't have to be. You might even have redundant data, the same data in two different sets of tables, because they serve two different types of queries, which is a big no-no in our traditional world of fully normalized databases.

But it does come at a cost, because as events are being written into the write side, the read side is trying to keep up and gaps open up. That's what I'm trying to show in this picture: the write side is a little ahead of the read side, because the write side isn't waiting for the read side to catch up; it's just slamming events in. The cost is that there's an eventual consistency relationship between the write side and the read side. And your instincts go: okay, eventual consistency, I don't like that.
That's kind of bad, but I'd almost implore you: don't give up on this too quickly. Push yourself. I know there are scenarios where eventual consistency isn't going to work, but necessity is often the mother of invention; push yourself to really think about whether you can come up with a solution where this will work, because the payoff is what I'm trying to show you: higher performance, less coupling, better messaging, things like that. A lot of people walk away saying, we can't do the eventual consistency thing, but just give it a shot; push it as hard as you can before you give up.

Another reason is to elevate the concurrency barrier. What I mean by this is that the traffic our systems take varies all over the board. Some systems I worked on in the past had a morning peak and an afternoon peak; things were busy during the week and weekends were slow, or maybe the weekend is really busy and the week is slow, whatever. The fun ones are the seasonal spikes, the infamous ones like Black Friday, Cyber Monday, and Singles Day in China, where the number of online purchases is way beyond what happens anywhere else in the world. You often read in the news about online sites that cratered because they got one of these spikes and the system just couldn't take it, and you feel the pain for the people responsible for those systems. Can you imagine being on the team whose system crashed on a big shopping day, and you couldn't take all those orders? I don't even want to be there.

The challenge is that the ceiling, how far you can go before you run out of gas, is often the database. You can only push the database so hard before it pushes back. You get the daily load and you go into this yellow territory where things start to slow down, and when you get into red territory the database really gets mad at you and slows things down. If your system is dancing along, poking at that ceiling, it's not a great place to be, but it's also not an easy problem to solve with the traditional way we've been doing things with databases. The spikes are the really fun ones, and that's what this graph is trying to show: you get into the yellow zone and the throughput of your database plateaus; you get into the red zone and it starts to nosedive. Response times get really bad: it used to be three or five milliseconds to do an insert, and now it's taking 150 milliseconds, and now you're in a world of hurt.

This is where the write-side/read-side split helps. You get a big spike in traffic, and the write side says no problem, because all I'm doing is inserts into a simple key-value store: no multi-table updates, no big transactions.
You're just slamming data into the write side, and the read side is struggling to keep up as fast as it can. In the balancing game of keeping the read side close to the write side, you have more options: maybe you have a lot more readers reading from the write side; there are all kinds of strategies you can use. Again, you have to be aware that with this approach you have eventual consistency to consider, but if it works for you, the concurrency ceiling goes way up; you can push a lot more through your system without running into that database brick ceiling.

This leads into messaging: simplify and harden messaging. When you're building a distributed system and you have messaging going on between services, you should think about every single message as it goes between services and categorize it in one of three ways: at most once, at least once, or exactly once. Everybody wants exactly once: "I want every message to go exactly once." Good luck; it's not easy to do. At-most-once is the easiest to implement and the one most often implemented, but I like to think of it as "maybe once," meaning most messages will make it and some will not. That's a fact of life: some messages won't make it. So if you go with an at-most-once approach, you're saying it's okay if I lose some messages, and there are plenty of scenarios where that's just fine. At-least-once I like to think of as "once or more": you're going to get every message, guaranteed; you're not going to lose any. But the cost is that you might get the same message more than once, so the receiver needs to be able to deal with that: oh, I've seen this message, I can ignore it. It's going to happen, so the receiver code has to handle it. Exactly-once I think of as "essentially once": there are tricks you can do to make it look like a message is delivered exactly once, but it's difficult. If you're doing that offset pull, which is the Kafka model of pulling messages from a topic, you can get a form of exactly-once in a couple of different ways. One is that you store the offset in the same transaction in which you store the change that the message made, so the message effectively gets processed only one time, because the offset and the data are committed together. Another is that you have some kind of intelligent filtering to dedupe messages: it looks at each message, and if it has seen it before, it says, I've seen this message, I can ignore it. So you get essentially once.
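A minimal sketch of that deduplication idea, in plain Scala with hypothetical names: the receiver remembers the ids of messages it has already handled and ignores repeats, turning at-least-once delivery into "effectively once" processing. In a real system the set of seen ids and the state change would be committed in the same transaction so they cannot drift apart.

```scala
// At-least-once delivery implies duplicates, so the receiver de-duplicates:
// it remembers which message ids it has already handled and ignores repeats.

case class Message(id: String, body: String)

class DeduplicatingReceiver(handle: Message => Unit) {
  private var seen = Set.empty[String]

  def receive(msg: Message): Unit =
    if (!seen.contains(msg.id)) {   // duplicates from redelivery are simply ignored
      handle(msg)                   // apply the state change
      seen += msg.id                // record that this message has been processed
    }
}
```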
I mentioned earlier that we'd look at a push approach, and our intuition pushes us in this direction: I have a service, it produced some data, and by golly it's my responsibility to make sure that data gets out to whoever needs it. The challenge is that you can't just post it somewhere and say, all right, it's probably going to make it. There are going to be situations where it doesn't make it, because if you don't do anything to make sure it makes it, you're going to lose messages. So the idea is that you have some kind of retry logic, but throw rocks at that as well: the retry logic has to survive the service going down while it still has messages that haven't been delivered. The litmus test would be: all right, we've set up our retry logic and the service just failed; when we come back up, are all the messages that need to be sent going to get sent? If the answer is yes, you have a good approach and you're not going to lose messages. But when you look at it really closely, the hunt is for the holes: are you going to lose messages?

This is the one that scares me the most, because people get Kafka and think, okay, great. And maybe once you get the data into Kafka you're good; the data is going to get where it needs to go, because the consumers are pulling from it. But the problem is the producer. Say your service stores some data in a database, commits that transaction, and then makes a call to Kafka. Everything's cool, right? Throw a rock at that: what's wrong with that picture? The gap, the leak, is that the transaction happened, but before you could call Kafka, boom, you go down. Or maybe Kafka isn't down; it's just that the network is down between your service and Kafka. What do you do? You've lost the message. The reason it scares me is that you go into production, everything is working, and all of a sudden you start hearing customer complaints and business complaints: we've got orders that are stuck in the new state and they're not moving. You start looking, you dig and dig, and ultimately you find that you're doing this: a database transaction and then a separate call to Kafka, and wherever those failures happened, you've lost some messages. That's a really subtle thing, and it scares me.

This is why, even just to get data from your service into Kafka, once you've persisted the event, or even if you've simply updated your database in a service that isn't using event sourcing and CQRS, you still have to have some approach that makes sure the message gets into Kafka and can retry if it fails. This is why I like the reader-style pull: there's a reader pulling events from the event log and pushing them into Kafka, and you get the at-least-once pull approach, so you will not lose any messages getting them into Kafka. And once Kafka has them, you've got at-least-once all the way down through the whole flow, and it doesn't matter who's consuming. It's important to think about this rather than just making that simple call to Kafka.
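A rough sketch of that reader-relays-to-the-broker idea, reusing the `EventSource` and `OffsetStore` interfaces from the earlier pull-reader sketch (the `BrokerPublisher` trait below is a stand-in for illustration, not the real Kafka producer API): the events are already safely persisted by the write side, and a separate loop publishes them in order, only advancing its offset once the broker has accepted each one.

```scala
// Relay sketch: events already live in the journal; this loop reads them in
// order and publishes them, advancing its offset only after a confirmed publish.
// If the service dies mid-way it restarts from the last confirmed offset, so
// nothing is lost (some events may simply be published more than once).

import scala.util.{Failure, Success, Try}

trait BrokerPublisher {                       // stand-in for a real Kafka producer
  def publish(topic: String, key: String, value: String): Try[Unit]
}

class JournalToBrokerRelay(journal: EventSource, offsets: OffsetStore,
                           broker: BrokerPublisher, topic: String) {

  def relayOnce(): Unit = {
    val from  = offsets.load("kafka-relay")
    val batch = journal.eventsAfter(from, 100)
    var ok    = true
    val it    = batch.iterator
    while (ok && it.hasNext) {
      val event = it.next()
      broker.publish(topic, event.offset.toString, event.payload) match {
        case Success(_) => offsets.save("kafka-relay", event.offset) // advance only after the broker accepted it
        case Failure(_) => ok = false                                // broker or network down: stop, retry later
      }
    }
  }
}
```

Contrast this with committing a database transaction and then making a separate, fire-and-forget call to the broker, which is exactly the gap described above.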
We're almost there, but this next one, eliminate service coupling, is the one I like the most. Here's a scenario: we have a bunch of services, and I've put the customer service in the middle. All these other services need some customer information in order to do their work, so whenever they're performing some action they just do a GET to retrieve some information from customer, the response comes back, and everything's fine. Piece of cake, real simple. Well, again: what will go wrong? What happens when customer goes down? When customer goes down, the blast area isn't just customer; it's all of these services that are coupled to it. You get this big blast area in your system where a bunch of services have collapsed because one service is misbehaving, and this is another form of coupling.

So how can we fix this? Here's a possible solution, and bear with me because this gets a little crazy. Customer is just creating events, doing event sourcing and CQRS like any other service. These other services are consuming from customer, and, here's the heresy, those services are storing their own view of customer in their own internal database, inside their own service. Part of our natural response is: oh my god, we're replicating data, we're duplicating data, these are dirty words. But they're not anymore, because storage is getting really, really cheap. And what do you want: a distributed system that works, that can take a hit and keep running, or an economical distributed system where you've made the trade elsewhere and it's more brittle? If you don't want brittle services where one service takes down a bunch of others, this is an approach, because now, when customer goes down, the other services aren't even aware of it. They don't care; they keep doing their thing. Requests keep coming in, they've got their own view of customer that works for them, and they keep right on conducting business. When customer comes back up, they just resume pulling data from it.

This is relatively new to me, but just last week I ran into somebody who works at a company with two or three hundred microservices. They've gone all-in on microservices and have been doing it for a number of years, and this is the basic pattern they've established; it's the rule for everybody implementing microservices in the company: if you need data, don't depend on somebody else; make every single service as independent and loosely coupled as possible. I was delighted, because this is one of the first people I've talked to who has done this at that scale, and they evolved into this pattern over time, through battle scars and experience. Again, the reaction is "what, are you nuts, you're replicating data," but the key thing is that because you've got an at-least-once messaging approach, where every message is guaranteed to make it eventually, it opens up options for doing things like this. The end result is a system that can compensate: one service goes down and the rest of the system says, I don't care, we're still running; the customers don't even know, and the business doesn't even know.
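A small sketch of that pattern (plain Scala, hypothetical names such as `LocalCustomerStore`): the order service keeps its own copy of just the customer fields it needs, fed by the same pull/at-least-once mechanism shown earlier instead of by synchronous calls, so a customer-service outage does not stop it from serving requests.

```scala
// Sketch of a service keeping its own view of another service's data.
// The view is fed by consumed customer events, not by remote calls, so it
// keeps working when the customer service is down and catches up later.

case class CustomerView(customerId: String, name: String, creditLimitCents: Long)

class LocalCustomerStore {
  private var views = Map.empty[String, CustomerView]

  // called by the projection that consumes customer events
  def upsert(view: CustomerView): Unit = views += view.customerId -> view

  def find(customerId: String): Option[CustomerView] = views.get(customerId)
}

class OrderService(customers: LocalCustomerStore) {
  def canPlaceOrder(customerId: String, orderTotalCents: Long): Boolean =
    customers.find(customerId) match {
      case Some(c) => orderTotalCents <= c.creditLimitCents // answered locally, no remote call
      case None    => false                                 // customer not replicated here yet
    }
}
```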
It also increases your flexibility to be a little more adventurous. Maybe you release a new version of a service; say you and I are on the team for customer, we release it, and it breaks in production because of some change we made. That's not something you ever want to happen, but now the fear is reduced, because if it does happen you're not taking down the whole system. The other way, where the services are more tightly coupled, if our customer service goes down it takes a whole bunch of other services down with us, and that's not a good place to be. So you can become a little more fearless with this kind of system.

The final one is: graduate from the IT nursery. This one is a little different from what we've been talking about. I spent a lot of time in a large enterprise IT organization, and one of the things that drove me crazy was governance. I wanted to change a table, I wanted a database, I wanted a server, I wanted a topic in Kafka or whatever, and you have to go beg for it. You can't have it until you fill out this form; okay, fill out the form, answer all the questions; tick tock, hours go by, days go by, and eventually you hear back: no, you can't have this, because Bob isn't ready, we need to talk about it; okay, let's talk about why you really need this, and on and on. I've been in meetings with 20 people talking about doing something like that. When I look at these kinds of approaches, it's: wait a minute, if it's my service, I own the data, it's my responsibility to make sure the service works in production, and on top of that no other services are tightly coupled to mine if it goes down, then why do I need all this nursery guidance? Why do I have to beg for permission to do things? The result should be that all that friction, all the things that slowed me down in the past, goes away.

So you reduce the governance. This is horrifying to enterprises, but it's something I think you have to get over. The argument is that if you reduce the order, you introduce chaos; my response is that this is organized chaos. If you and I are working on a service and we break it, it's our responsibility to fix it, and because we've made it as loosely coupled as possible, what's the damage? We can fix it really quickly because we own it; we don't have to ask anybody to change a table; whatever we made the mistake on, we just fix it and put it back into production. It's an organized revolt, in a way: organizations simply can't afford all this friction anymore. I know the governance was put in place not to be mean or nasty but to try to keep us out of trouble, but the way I like to look at this is as a 180-degree change: we still keep out of trouble, because we're following these principles of building loosely coupled services and owning our own data, all the things I've been talking about. As a result, you should have a really streamlined, fast, efficient organization without a lot of ceremony and ritual to get things done.

Finally, as always when preparing this talk, I came across a quote:
"People are very open-minded about new things, as long as they're exactly like the old ones." And we just can't afford to be this way anymore. As you can tell, I'm a big fan of event sourcing and CQRS, and even within Lightbend I've had a lot of interesting discussions about it, like, wait a minute, you've got to calm that down, because it doesn't fit everywhere. Yes, I know it doesn't fit everywhere, but you want to make an informed, educated decision about where it fits and where it doesn't fit, not just dismiss it as too radical. The challenge is, as I said at the beginning, this is where the lava starts boiling up: when you start talking about event sourcing and CQRS, and microservices that own their own data and their own schema, people start looking at you like you've lost your mind, because it's totally different from what we've been doing before. They hear about microservices and at first it sounds really good, but then you start digging into it: wait a minute, this is way too different, I didn't sign up for all these changes. Look, you've got to get familiar with these new things; you can't afford to just sit back. Even eventual consistency, like I mentioned, is a hard one to take, but don't give up on it right away; push it as hard as you can until it hurts, and if it doesn't fit, back off. More often than not, if you push these things, the reward is what that one company I was talking to found: this is their pattern, they kept pushing themselves, and now they have a system that can evolve for the business at a really high rate. The business can ask for a new feature, and the ultimate goal is to let the business go as quickly as possible without getting yourselves all screwed up trying to roll things out.

So that's basically it, the seven reasons. We started with domain-driven design; I hit on service coupling and messaging between services multiple times, so that was one theme of the talk; the other was the impact on performance and the dynamics of the data itself, the read/write performance, the concurrency, and so on; and the final one was a bit of a call to rebellion on governance. If you're in a big IT organization, one of the things I think you can start to chip away at, when you start to adopt these kinds of technologies, is the processes that just slow you down and don't actually keep you out of trouble in this new environment. That's all I've got; we have plenty of time for some questions.

[Audience] First of all, thank you, great talk. I have a couple of questions. One: when you talk about eventual consistency, I wonder if you have patterns for basic levels of consistency, for example read-your-own-writes. In the example where I remove an item from my shopping cart, I think it would be terrible to hit F5 and see it back in my shopping cart again. And the second question:
you didn't mention dead-letter messages or events; I'm sure you've dealt with them, and I'm curious what your thoughts are.

Yes. On the eventual consistency part, one thing is that you can ask the entity for its state; that's a simple query that the write side can give you: what's the current state of my entity, what does my shopping cart look like right now? The authoritative source for the current state of the entity is always going to be the write side. So if you need "what is it right now," and you're not doing any fancy query like "what orders has this customer placed over time," you have that option; you don't always have to query the read side to get the state of the entity. But it's limited: the queries you can do against the write side are really just "what's the current state of the entity." With the read side, you always have to be aware that you might not be seeing exactly the current state, and you have to take that into consideration when you're doing queries. One thing I did hear about, though, is that one company mentioned they still wanted to do a query, but the query actually triggered making sure that any pending changes for that entity still on the write side, that hadn't yet been seen on the read side, were pulled over in the course of the query. I don't know exactly how they did it, but that was the technique they mentioned; it kind of caused a flush, and it could be based on a timestamp or a version, something like that. It sounded a little complicated, but they made it work, so that's an example of really pushing on eventual consistency.

On the dead-letter question, where a message was intended for a recipient that's no longer there, or the recipient is consistently failing to process it, or the body can't be deserialized, so it's a poison message: that's a really good question. At all costs, and this is something you really have to test for, your receiver can never blow up when it gets a message like that. There has to be the ultimate try/catch kind of thing, where a really bad message comes in and it can't take the service down. We've had these discussions at various times in the past where that was a concern, because it had happened, and I consider it a bug in the system; and I know, okay, fine, you can call it a bug, but we're in production and we're down. You've got to really think about that when you're setting up the service: in your testing, try to think of every nasty thing you could do to the receiver, the worst possible message you could send it, and make sure the receiver can handle it and deal with it appropriately. Even then, the receiver has gotten a message it can't understand, which in a way is some kind of bug, and the challenge is: what do you do with it now? Some people will just throw it into an error queue, or an error file, or an error table, and nobody ever looks at it. So it's a difficult problem.
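A minimal sketch of that poison-message handling, under the assumptions above (plain Scala; `DeadLetterStore` and `ResilientReceiver` are hypothetical names, not a real library API): the receiver is wrapped in a catch-all so one bad message can never take the service down; the message is parked somewhere a human will actually look, and processing moves on.

```scala
// Poison-message handling sketch: a message that cannot be deserialized or
// processed is logged/parked in a dead-letter store instead of crashing the
// consumer, which then moves on to the next message.

import scala.util.control.NonFatal

trait DeadLetterStore {
  def park(rawMessage: String, reason: String): Unit   // e.g. an error table or queue someone actually monitors
}

class ResilientReceiver(handle: String => Unit, deadLetters: DeadLetterStore) {
  def receive(rawMessage: String): Unit =
    try handle(rawMessage)                              // normal processing path
    catch {
      case NonFatal(e) =>
        // never let one bad message take the service down: record it and keep going
        deadLetters.park(rawMessage, String.valueOf(e.getMessage))
    }
}
```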
Ultimately, the goal is to try to keep things as simple as possible. I know that's not always achievable, but the first thing is to be resilient: the consumer should be able to take pretty much any garbage in and still be able to say, I don't understand this message, but here's what I'm going to do with it: log it, throw it into some kind of queue, and move on. You don't want everything to stop just because you got a poison message.

[Audience] Suppose your write side is a Kafka topic. What about retention policy: is the canonical approach to keep messages in perpetuity? And how do you think about replay?

That's a really good question, because typically Kafka has a limited retention where messages stay for a week or two, something like that. When people talk about event sourcing and CQRS, they often say the idea is that you never delete events, and of course the pushback is, wow, our data is going to explode. It could, but it really depends. If you have, say, a bank account, its history goes on forever; you could have hundreds or potentially thousands of events to give you the current balance. On the other hand, an order has a relatively finite number of events before it's done and will never be changed again. So it really depends on the kind of data, and on retention as well. I'm not a purist; I'd rather look at each case, and if it's okay to only keep the data for a certain amount of time and then it's gone, fine, get rid of it. Cassandra can do the same thing; it can age out old data after a certain amount of time, so you don't just keep accumulating garbage.

One of the things that drove me to this talk, though, was the whole replay thing: when you're using event logs, you can recover, because you've got all the events; just replay them and off you go. Okay, big deal. But here's where it is nice. Say you design your read side and go with a relational database first as your read-side query database, and then somebody comes along and says, we'd really like to be able to do this search, more of an Elasticsearch kind of query. So you'd like to add an Elasticsearch database alongside the relational one: some queries come from the relational database, some from Elasticsearch. If you've kept all your history, that's where the advantage comes in, because you can take that history and load it into the Elasticsearch database, and now, as a new feature of your microservice, you have these new queries available. Again, it's all a black box from the outside; you just change your API: here are some new queries you can do that are really cool, these Google-like searches. On the inside, you've made a fairly major persistence change. If you're throwing away the data, that's not a possibility. It's a trade-off: if you don't think you'll ever need to do that, don't worry about it.
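A sketch of that back-fill step, reusing the `EventSource` interface from the earlier pull-reader sketch (the `SearchIndex` trait is a stand-in for illustration, not the Elasticsearch client API): a brand-new read model is populated by replaying every stored event from offset zero, and from then on it is kept current by the same pull mechanism as any other projection.

```scala
// Replay sketch: because the full event history is kept, a new read model can
// be added later and back-filled by replaying every stored event from offset 0.

trait SearchIndex {                               // stand-in, not a real search-engine client
  def index(id: String, document: String): Unit
}

def backfill(journal: EventSource, searchIndex: SearchIndex): Long = {
  var offset = 0L
  var more   = true
  while (more) {
    val batch = journal.eventsAfter(offset, 500)
    if (batch.isEmpty) more = false
    else {
      batch.foreach(e => searchIndex.index(e.offset.toString, e.payload))
      offset = batch.last.offset                  // resume point for the next batch
    }
  }
  offset                                          // the new read side is caught up to this offset
}
```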
The natural reaction is: oh my god, if we keep all the history, isn't the amount of storage going to explode? Well, think about it a bit. Take the orders: I've talked to people and asked them, has the volume of your data gone up since you started using event sourcing and CQRS? No, because with orders there's a finite number of events, so the actual storage for an order as a bunch of events plus the read side, versus an order in a relational database done the traditional way, may be more, but it's not excessively more.

[Audience] Thanks for the talk. When it comes to the trade-off between availability and consistency, is that a decision usually made by each microservice individually, or do you need those rituals and ceremonies to make that kind of decision?

It's definitely service by service. The eventual consistency decision is: will it work for this service or not? It's not an across-the-board thing at all; it's okay for some services and not okay for others.

I think we're going to have to get ready for the next session, so again, I really appreciate it, and I'll be around. [Applause]
Info
Channel: Reactive Summit
Views: 21,823
Keywords: Reactive, Summit, 2018, Event Sourcing, CQRS, Microservices
Id: wBvH7foXXUY
Length: 56min 12sec (3372 seconds)
Published: Sun Dec 09 2018