Gopherfest 2017: Event Sourcing – Architectures and Patterns (Matt Ho)

Captions
All right, I'm going to talk about event sourcing and Go; this is going to be my journey into it. My name is Matt Ho. I'm a technologist, and I've been in the Valley for a long time with both small and large companies. If you're familiar with City CarShare, I built the technology for that back in the early 2000s. More recently I started a company with some friends called Kieden that was acquired by Salesforce ten years or so ago. So I've been working with both large and small companies, and really, like many of you, I've had a journey that's led me to event sourcing, and I thought I'd share it with you.

If you rewind the clock to the good old days, way back when, most application architectures looked like this: you had an app server, you had a database, and life was good. It was really simple to understand, it was easy to scale up, and when you wanted to figure out whether all this stuff needed to belong in a transaction, transaction boundaries were very easy to maintain. There were a lot of pros to it, but unfortunately it wasn't all pros. As you know, over time, for various reasons, most monoliths that I've seen eventually turned into this: very difficult to work with, calcified interfaces. Not to say that they're not good in some places, but it seems like as a community we're moving farther and farther away from that.

Still, that notion of an app server stuck to a database is a really powerful one, very easy to understand. These days I think we call it a bounded context: rather than having all the applications stuck together, we realized we can have a defined portion of the application, now with its own database, and call it a bounded context. And if we stick a lot of those together, we get this thing called microservices.

Microservices, for me, made life a lot easier in that teams could move more independently from one another and didn't have to worry about stepping on each other as much. But as I fell in love with them, I also saw some negative sides. The challenge is that now you've got all these services that need to find out about each other and communicate with one another, and all of this has to happen in real time. So if one of those servers is not happy, the whole system gets unhappy; if one goes down, it causes back pressure that spins right up through the system.

What I noticed working in these microservice environments is that for as much as they promise faster productivity, in my experience they came with a lot of complexity. In fact, there was a whole who's-who of things I had to learn and convey to the team so they could work effectively in the microservices world. Just a small list: folks had to be aware of expand/contract cycles, because you've got services depending on each other; how do these services find each other; what happens if a service is down and I need to keep going? All of these are concepts probably worthy of talks unto themselves, and I'm not trying to give those talks here; the point is just, hey, that's a lot of stuff. And not only are there a lot of concepts: when I have a bunch of services that are all required to process a request, suddenly that 99th percentile, which I didn't care about in the monolith, becomes a real issue. And there's a whole bunch of technologies now, services like Consul, etcd, Fabio, and Kubernetes, that the team has to be
at least moderately familiar with to be able to work in this new microservices world. What I've seen a lot of organizations do is devote entire teams to building tooling around this microservice architecture to make it simpler for everyone else. One of the challenges with the tooling is that, because of the somewhat arcane way everything is set up (there are usually a lot of scripts that tie things together), there's typically only a small number of people who actually understand how it works.

Just for a quick show of hands, how many people here are using microservice architectures? Okay. I'm not trying to pick on you; these are my own experiences, and I was just trying to see if there's another way to do this. And for those of you who raised your hands, how many can recognize some of the issues I'm pointing out here? Okay, a fair number. In the end I'm left feeling: really, do we have to do all this stuff? Is there another way? So this was really an exploration to say, well, what can we do?

I made up a wish list of what I would like in whatever sort of thing gets created. Here it is. First, I don't ever want to do another upgrade. Early in the Linux days, when it first came out, I loved downloading the kernel, building my own version, doing the multi-stage GCC compilation; I was all down for that. These days, just give me a Docker image; I don't care to do that stuff anymore. Second, when it comes to troubleshooting, I'll happily troubleshoot the stuff I build, but I really don't want to troubleshoot your etcd or ZooKeeper: (a) I'm not that good at it, and (b) even if I were, I really don't want to do it. I'd rather focus on delivering business value than troubleshooting all these things, and the challenge is that even if they're containerized, at some point they will fail, and it's your responsibility, having chosen the software, to be able to fix the system. Third, I don't want to pay a fortune for it. I don't want to spin up a bazillion instances, I don't want to manage a bunch of servers, and I don't want my finance person coming to me asking why the AWS bill is this big. Fourth, I often work with a lot of junior developers, and complex systems are very difficult for them to handle, so I want a system that's easy to explain, where I can sit someone down for a few minutes, say "here's how it works," and they're off to the races. And lastly, while I would love to have the problem of scaling, I don't want to think about it; I want somebody else to deal with it. So that was my wish list going into this event sourcing world.

Probably one of the big jumps for me was a project my team did last year for T-Mobile as part of their T-Mobile Tuesdays program. T-Mobile wanted to deliver stock to their customers as part of the program. The challenge was that there was no system to do this, and we were about four months from the launch. So we had to build a stock distribution accounting system from scratch in four months, and have it robust enough that we'd be comfortable with the SEC as well as with any risk to T-Mobile's reputation. As you might expect, we forwent the traditional microservices approach and went with an event-driven approach. The project launched on time, it was very smooth, T-Mobile was super excited, and my big takeaway was that there really is something to this event-driven stuff. This
system was built on top of DynamoDB. There was no SQL within the system; it was just a series of DynamoDB calls, and that really opened my eyes to what you can do when you surrender to AWS, give in, and do things its way.

So, the heart of the talk: what is event sourcing? Let me start with a domain we're probably all familiar with, e-commerce, and use that as an example. The example we'll work with (we'll call this the bounded context) is a simple one: an order with an order item. I'm not even going to go through the order items; I'm just going to show this as a general concept. Most of you are probably familiar with something that looks like this. If I were to design it in a traditional relational-database microservice architecture, I'd probably end up with something like this; I'm hyper-simplifying, so forgive me if I leave out a bunch of stuff. In the database there's probably an order ID someplace, and when the order is created there's some state on it that says "order created." When I update it to say this order is approved and ready to go to the next stage, I smash that state within the database and replace it: we've erased the old value, and the updated record now looks like that. When the order gets shipped, we do it again, and now it looks like that. This is probably familiar to everyone here.

If we were to do this the event-sourced way, here's what event sourcing says: instead of having the database be the current state of the world, have the database be the summation of all the deltas that went into creating that current state; then you can rebuild the current state whenever you need it. Quick question for the developers: how many people here use git? Okay. And how many people knew that git was an event-sourced system? A version control system is just a series of changes that you save, every one of your commits, and when you do a checkout or a clone or whatever, all it's doing is replaying those commits in the same order they arrived to reconstruct your filesystem. We're going to do the same thing, but instead of a filesystem, we'll do it with this bounded-context object.

So we can create events that look like this. Our first event is "order created" (you can pretend there are plenty of other fields associated with it; I'm leaving them out to keep this easy), followed by "order approved," followed by "order shipped." These three events together, when applied in the right order, give you the current state of the order, which is that the order was shipped. How many people here have done or played around with functional programming? Okay, great, you're going to understand this: effectively, the current state of the application is nothing more than a left fold over previous behaviors; you just wrap them onto each other until you get to the current state. I'm going to switch over and show you this in code. Whoa, up there, this is me, weird. [Applause] Okay, that is so weird. All right. I have a little bit of Go code here, and again, I'm hyper-simplifying just to illustrate the point.
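Before the walkthrough, here is the "left fold" idea in isolation, as a tiny Go sketch of my own (this is not the code on screen): state is rebuilt by applying events in arrival order.

    package main

    import "fmt"

    // Event is anything that can change state when applied.
    type Event interface{}

    // foldLeft rebuilds current state by applying events in arrival order,
    // the same way git replays commits to reconstruct a working tree.
    func foldLeft(apply func(state string, e Event) string, initial string, events []Event) string {
        state := initial
        for _, e := range events {
            state = apply(state, e)
        }
        return state
    }

    func main() {
        events := []Event{"created", "approved", "shipped"}
        state := foldLeft(func(state string, e Event) string {
            return e.(string) // each event simply becomes the new status
        }, "", events)
        fmt.Println("order state:", state) // prints: order state: shipped
    }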
So what I've done is I've created three structs: OrderCreated, OrderApproved, and OrderShipped. Again, you can add all the properties to them that you want; I'm leaving them out for simplicity of the model. I've also included a struct that implements what I use as the event-sourcing base, and I'll go into it really quickly (ah, too small): the model itself has just three things, the ID, the version, and when this happened, so it's very similar to the diagram we had before.

So those are the event objects. Now I'll create the bounded context. I could have an array of order items within this order, but again, for simplicity, I'm choosing not to; I'll just have the ID of the order, the current version number, when it was created, when it was updated, and the current state.

So what does the left fold look like in the Go world? Just a simple switch statement. Every time I receive an event, I switch on its type: if it's an OrderCreated I do some stuff, if it's an OrderApproved I do some stuff, likewise for OrderShipped. You can make this as complicated as you like, but what I like is that for the simple case I didn't have to think about ORMs or object mapping or anything like that; it's just straight Go code, which I feel is very much in the Go ethos. It's pretty clear what's going on here.

If I go down a little farther, I'll show you the working example; it actually all fits on the page. We make up a fictitious order with ID 123 and three events: the order was created, the order was approved, then the order was shipped. We instantiate an instance of the order, apply these three events, and at the end print out "the order was such-and-such on this date." Let's run this. Okay, great: you can see the order was shipped on that date at that time. Nothing really magical about this code, and what I like about it, going back to my wish list, is that I feel I could show this to someone and they could understand it and follow along; if they had to add a new event, they could add it without too much pain.
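Here is a minimal, self-contained sketch of roughly what is on screen at this point. The names (Model, OrderCreated, Order, On) follow the talk's description, but the exact fields and signatures are my assumptions, not the speaker's actual source:

    package main

    import (
        "fmt"
        "time"
    )

    // Model is the base every event embeds: the aggregate's ID, the
    // event's sequence number, and when it happened.
    type Model struct {
        ID      string
        Version int
        At      time.Time
    }

    // The three events from the talk; real events would carry more fields.
    type OrderCreated struct{ Model }
    type OrderApproved struct{ Model }
    type OrderShipped struct{ Model }

    // Order is the aggregate: the current state rebuilt from events.
    type Order struct {
        ID        string
        Version   int
        CreatedAt time.Time
        UpdatedAt time.Time
        State     string
    }

    // On is one step of the left fold: a plain type switch, no ORM needed.
    func (o *Order) On(event interface{}) error {
        switch e := event.(type) {
        case *OrderCreated:
            o.ID, o.CreatedAt, o.State = e.ID, e.At, "created"
            o.Version, o.UpdatedAt = e.Version, e.At
        case *OrderApproved:
            o.State = "approved"
            o.Version, o.UpdatedAt = e.Version, e.At
        case *OrderShipped:
            o.State = "shipped"
            o.Version, o.UpdatedAt = e.Version, e.At
        default:
            return fmt.Errorf("unhandled event, %T", event)
        }
        return nil
    }

    func main() {
        events := []interface{}{
            &OrderCreated{Model{ID: "123", Version: 1, At: time.Now()}},
            &OrderApproved{Model{ID: "123", Version: 2, At: time.Now()}},
            &OrderShipped{Model{ID: "123", Version: 3, At: time.Now()}},
        }

        order := &Order{}
        for _, event := range events {
            if err := order.On(event); err != nil {
                panic(err)
            }
        }
        fmt.Printf("order %s was %s at %v\n", order.ID, order.State, order.UpdatedAt)
    }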
All right, let's go back. So what have we done? We've shown two things: we created a series of events, and we have that order object, which I'll call an aggregate (aggregate is a term from the domain-driven design world), and all we're doing is applying those events to the aggregate to arrive at current state. Well, how do I get the events? To get the events I can use a pattern many of you are familiar with: a command. A command, when passed through a handler, generates events. So let's take a look at that. What you'll see is basically the same thing as last time, OrderCreated, OrderApproved, OrderShipped, but now I've introduced commands. Again, I'm leaving out the details because they're not important to the illustration. I created two commands, CreateOrder and ApproveOrder, and as you might expect they're paired up: CreateOrder results in OrderCreated, and ApproveOrder results in OrderApproved. Here's what the logic looks like, all within this one view: on CreateOrder, I create a new instance of the OrderCreated event and return the list of events. So again, the command goes through the handler, and the handler emits zero or more events out the other end. When the order is approved, you can see I've put in a little bit of logic: for the order to be approved, its state has to be "created," so I won't approve an already-shipped or already-approved order. This command handler gives me the opportunity to reject the command, to emit a validation error, a business-domain error; but like the event processor, it's just straight Go code. There's no magic, no funny business; it's very easy to understand.

Putting it together (here's the event handler from before), let's look at main. In main we create the order: here's my CreateOrder command, I tell the order object to apply the command, I get back a series of events, and I apply those events in the order they were received, making sure there were no errors on application. Then I approve the order: I create the ApproveOrder command, apply it, and apply all the events that came out the other end. When I run this, we see the order was approved, which is what we expect, and here's the time. Fantastic. My only problem is that there's a lot of boilerplate code here to reapply the events; there must be something better we can do, and it turns out there is. In this eventsource package I have the notion of a dispatcher, and what the dispatcher does, in essence, is exactly the code we just went through: it executes the command, takes the resulting events, and applies them to the aggregate, the order. Now that boilerplate has gone away, and the dispatcher has basically the same interface the order does: I can just say dispatcher, create the order; dispatcher, approve the order. When I run this, just like before, I get order approved on that date. Fantastic. So far we haven't done any magic; all of it has been straight Go code.
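Again a sketch under assumptions: the real eventsource package differs in its details, but the command-and-dispatcher shape the talk describes looks roughly like this (it builds on the Order, Model, and event types sketched above):

    // Commands express intent; the aggregate's Apply method acts as the
    // command handler, emitting zero or more events or rejecting outright.
    type CreateOrder struct{ ID string }
    type ApproveOrder struct{ ID string }

    func (o *Order) Apply(command interface{}) ([]interface{}, error) {
        switch cmd := command.(type) {
        case *CreateOrder:
            return []interface{}{
                &OrderCreated{Model{ID: cmd.ID, Version: o.Version + 1, At: time.Now()}},
            }, nil
        case *ApproveOrder:
            // business rule: only a created order may be approved, so an
            // already approved or shipped order rejects the command
            if o.State != "created" {
                return nil, fmt.Errorf("cannot approve order in state %q", o.State)
            }
            return []interface{}{
                &OrderApproved{Model{ID: cmd.ID, Version: o.Version + 1, At: time.Now()}},
            }, nil
        default:
            return nil, fmt.Errorf("unknown command, %T", command)
        }
    }

    // Dispatcher removes the boilerplate: run the command, then fold the
    // resulting events straight back into the aggregate.
    type Dispatcher struct{ Order *Order }

    func (d *Dispatcher) Dispatch(command interface{}) error {
        events, err := d.Order.Apply(command)
        if err != nil {
            return err
        }
        for _, event := range events {
            if err := d.Order.On(event); err != nil {
                return err
            }
        }
        return nil
    }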
Let me get back into this. So far, everything we've done has been local to my laptop, and that's all well and fine, but what does it look like when I want to put this thing into production? Where does the data go? What do I have to manage? Going back to that microservices issue: I don't want to have to manage stuff; I just want it to go someplace. As I hinted earlier, we had a lot of really good experience with DynamoDB (it was surprisingly robust), so we figured we'd try it here as well. Quick show of hands: how many people are familiar with DynamoDB? Almost everyone, so I'll keep the explanation short. DynamoDB is basically a NoSQL database provided by Amazon, and like all the other Amazon products it has this great capacity to scale up as you turn the number up and pay more, or scale down as you need less.

So we can take this: order created, order approved, order shipped. In a relational database, you can imagine we'd have an order ID column, a version column, and a data column that might be a blob of whatever data we have. With three events we'd have three rows in the database; with ten events, ten rows, and so on. Well, it turns out that because DynamoDB is a NoSQL database, you can play a little trick. Most of these events are going to be very small; if an event is a couple hundred bytes, I'd call that pretty big. Meanwhile a DynamoDB item, which is what they call a record, can store up to 400 KB of data, which is a lot. So instead of making each event a single DynamoDB item, what if we smash them together and put a bunch of events in a single record? Think about most objects: I'm really familiar with the fintech space, and a typical brokerage transaction doesn't go through that many state changes, maybe a dozen or two dozen. So most of the time your entire event stream fits within a single DynamoDB record, which was a big plus as we were thinking about this. As you accumulate more and more events, you can imagine your DynamoDB table shaped a little like this: the first row, call it page 0, has events 1 through 99; page 1 has events 100 through 199; and so on. You can cram a metric ton of events into a very small number of DynamoDB records, which is a pretty neat thing.

So what does it take to actually make this happen? Let's go to another code example (you're going to hate this example after a while). It's basically the same one I've been using: there's the command handler from before, and everything is pretty much the same, but now I'm using this package I put together for event sourcing. It's very simple, and I'm not going to go into too much detail, just the high level. It has the ability to use DynamoDB as a store, so here I'm saying that in us-west-2 there's a DynamoDB table called orders that I'm going to use as my store, and then I do a little bit of bookkeeping to set things up: I create the repository where all my events live, backed by this store, plus a serializer that knows how to serialize these three event types. From the previous example, here's my dispatcher, and everything at this point is exactly the same: I create the order via the dispatcher and approve it via the dispatcher. The only thing I've added is this Load call, which asks the repository for the current version of the object with the ID I pass in. What it returns is an Aggregate. Aggregate is basically an interface, because I wanted something a little better than just interface{}; it's really just anything that has an On method. So I need to type-assert it back into the original Order object, which I do here, and then print out the same thing as before. Before we run it, let me figure out how to move this window around. Okay, here's the orders table in us-west-2.
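The talk doesn't show the storage layer's internals, so this is a hedged sketch of what appending a page of events with an optimistic lock could look like using the AWS SDK for Go. The attribute names ("key", "partition", "version", "events") mirror the demo's item shape as best I can tell, and writing the whole page blob in one PutItem is my simplification:

    package main

    import (
        "strconv"

        "github.com/aws/aws-sdk-go/aws"
        "github.com/aws/aws-sdk-go/aws/awserr"
        "github.com/aws/aws-sdk-go/aws/session"
        "github.com/aws/aws-sdk-go/service/dynamodb"
    )

    // appendEvents writes one "page" of serialized events, using a
    // conditional write so two concurrent writers can't both win.
    func appendEvents(svc *dynamodb.DynamoDB, orderID string, page, prevVersion, newVersion int, payload []byte) error {
        _, err := svc.PutItem(&dynamodb.PutItemInput{
            TableName: aws.String("orders"),
            Item: map[string]*dynamodb.AttributeValue{
                "key":       {S: aws.String(orderID)},
                "partition": {N: aws.String(strconv.Itoa(page))},
                "version":   {N: aws.String(strconv.Itoa(newVersion))},
                "events":    {B: payload}, // the SDK base64-encodes binary attributes
            },
            // Optimistic lock: succeed only if nobody advanced the version.
            ConditionExpression: aws.String("attribute_not_exists(#v) OR #v = :prev"),
            ExpressionAttributeNames: map[string]*string{
                "#v": aws.String("version"),
            },
            ExpressionAttributeValues: map[string]*dynamodb.AttributeValue{
                ":prev": {N: aws.String(strconv.Itoa(prevVersion))},
            },
        })
        if aerr, ok := err.(awserr.Error); ok && aerr.Code() == dynamodb.ErrCodeConditionalCheckFailedException {
            return aerr // another writer won the race; reload and retry
        }
        return err
    }

    func main() {
        sess := session.Must(session.NewSession(&aws.Config{Region: aws.String("us-west-2")}))
        _ = appendEvents(dynamodb.New(sess), "123", 0, 2, 3, []byte(`[{"type":"OrderShipped"}]`))
    }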
I'll do a quick refresh (let me get rid of this) so you can see there's nothing up my sleeve, and now we'll go ahead and run that command. Huzzah: order approved at this time. When I refresh, here's the item: the key, which is just some random gobbledygook, with partition 0, and here's the base64-encoded version of the events. In case you're wondering what's inside, I'll crack it open, paste it into a base64 decoder, and decode it, and what you can see is that it's just a JSON object, a very, very simple JSON object.

Okay, so now we've got a place to put these events that's very scalable, but it's still kind of a hermetically sealed world: we can create events, stick them in there, and retrieve them, but how do I connect this to the rest of my application? Where do those pieces go? It turns out DynamoDB has another fantastic service called DynamoDB Streams. It's connected to DynamoDB such that every time you make an insert, update, or delete, an event gets created and thrown into the stream, which you can consume and do something with. When I saw that (this was on the previous project), I felt like: this is fantastic. Why is it so fantastic, and why was I so incredibly ecstatic? Look at what we've got: an application that saves its state to DynamoDB; DynamoDB dumps its changes out to Streams; and Streams can connect to Lambda really simply.

And here's where I think Go just shines: because of the static binary, the low resource consumption, and the very quick startup time, Go is a fantastic candidate for Lambda functions, via the Apex package, which I highly recommend. We've been using Go with Lambda for a while and have been, I feel, hugely successful with it. I've talked to other teams that tried to use Python or Node, and often the challenge you hear is: how do I make sure all the dependencies I had locally get shipped up to the Lambda container and work there? I don't have to worry about that with Go; what I test locally is almost exactly what runs remotely. Just that simplicity makes my life a lot easier; I can't imagine a better environment to do this in.

So what does it mean now that I have Lambda? One use case, which I think is among the most powerful, is that I can take that stream of events, reassemble them, and ship them into Firehose. Firehose is an AWS service that takes your stream of data and throws it someplace; the someplace in this case is S3. What it does is create a directory structure a little bit like this for all the events (it won't look exactly like this): you feed events in, and it lays them out in a directory structure keyed by timestamp, so you can follow them by time. This is a wonderful thing: all your events since the beginning of time are in this one place.
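The talk predates native Go support on Lambda (hence Apex); with today's aws-lambda-go package, a Streams-to-Firehose forwarder might look roughly like this. The delivery stream name and the choice to forward each record's new image as JSON are my assumptions:

    package main

    import (
        "context"
        "encoding/json"

        "github.com/aws/aws-lambda-go/events"
        "github.com/aws/aws-lambda-go/lambda"
        "github.com/aws/aws-sdk-go/aws"
        "github.com/aws/aws-sdk-go/aws/session"
        "github.com/aws/aws-sdk-go/service/firehose"
    )

    var fh = firehose.New(session.Must(session.NewSession()))

    // handler receives batches of DynamoDB Streams records and forwards
    // the new image of each write into a Firehose stream bound to S3.
    func handler(ctx context.Context, e events.DynamoDBEvent) error {
        for _, record := range e.Records {
            if record.EventName == "REMOVE" {
                continue // forward inserts and updates only
            }
            data, err := json.Marshal(record.Change.NewImage)
            if err != nil {
                return err
            }
            if _, err := fh.PutRecord(&firehose.PutRecordInput{
                DeliveryStreamName: aws.String("order-events"), // assumed name
                Record:             &firehose.Record{Data: append(data, '\n')},
            }); err != nil {
                return err // Lambda retries the batch on error
            }
        }
        return nil
    }

    func main() {
        lambda.Start(handler)
    }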
Has anyone here heard of or used Kafka? Show of hands; okay. Kafka is something we explored using: it's basically a place where you can get this ordered series of events that you can play back. From a pricing standpoint, though, for S3 I pay, what is it, 2.3 cents a gigabyte, and it's highly redundant, super scalable, and so on. With Kafka, on the other hand, I need to stand up a number of replicas, each replica is an EC2 instance backed by EBS volumes, I need a ZooKeeper, there are a lot of servers involved, and, by the way, even though Kafka is very reliable, I need engineers who understand how to run it. Or I could just throw the events in this bucket. Why would I want the bucket? Because if I copy that bucket to my local directory, I have all the events the system has had since the beginning of time. How hard is it to take a directory on your local filesystem and run all the events through to see if your system's behavior has changed? Regression testing your application becomes way, way simpler. Figuring out whether you've got a bug: if I need to debug, I can take any object and just look at the series of events that got it to today. Testing is a lot easier. And because S3 itself also supports Lambda triggers, if a new event-stream object arrives and I want to do something, like a count or sending out an email, I can watch the Firehose delivery into S3 and trigger off of that to do a calculation, an email, what have you. And if I put CloudFront in front of S3, I can do hyper-fast pulling, so if I want to replay a large number of events over and over, I don't have to worry about bogging down S3.

Just to wrap up: Firehose can go to a lot of places. If I need a big, stable store, it goes to Redshift; if I need to build queries, it goes to Elasticsearch; and I can write more Lambda converters to reshape the data into whatever Elasticsearch or Redshift needs. Here's one of my favorites, though, which I use a lot: Lambda to SNS and SQS. Take an event that arrives, say the order shipped. Oftentimes you want a side-effect process to happen, like sending out an email. The traditional way people do it is to write to the database and then send the email. That's okay, but realistically, sometimes your server is going to crash and that email is not going to go out; I'd like a way to make sure the email goes out at least once. Here, I can just take the event, throw it into SNS, and something can subscribe to that topic to send the email. So it's a great place to handle side effects; if you want to cross bounded contexts, it's a great way to send things across; and it's great for visibility when things go wrong. What we found with the T-Mobile system is that when something breaks, the break looks like the SQS queue in front of some component starting to build up, and it's really obvious what's broken, because everything has a queue in front of it and you can use the built-in dashboards to figure out what's going on. I'll just add: Lambda is going to continue to grow, and I have no doubt there will be more and more things it connects to. The key thing is that having the events in raw form is really what enables all of this.
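As a sketch of that at-least-once side-effect hand-off (the topic ARN and message shape here are placeholders of mine, not from the talk): publish the raw event to SNS, let an SQS queue subscribed to the topic feed the email worker, and a crashed worker simply leaves the message on the queue to be retried.

    package main

    import (
        "encoding/json"

        "github.com/aws/aws-sdk-go/aws"
        "github.com/aws/aws-sdk-go/aws/session"
        "github.com/aws/aws-sdk-go/service/sns"
    )

    // publishOrderShipped pushes the raw event onto an SNS topic so that
    // subscribers (an email sender, another bounded context) can react to
    // it independently of the write path.
    func publishOrderShipped(svc *sns.SNS, event map[string]interface{}) error {
        body, err := json.Marshal(event)
        if err != nil {
            return err
        }
        _, err = svc.Publish(&sns.PublishInput{
            TopicArn: aws.String("arn:aws:sns:us-west-2:123456789012:order-shipped"), // placeholder ARN
            Message:  aws.String(string(body)),
        })
        return err
    }

    func main() {
        sess := session.Must(session.NewSession(&aws.Config{Region: aws.String("us-west-2")}))
        _ = publishOrderShipped(sns.New(sess), map[string]interface{}{"type": "OrderShipped", "id": "123"})
    }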
So, going back to my wish list and what got me started here. I don't ever want to do another upgrade: I don't know that you'd build your entire system this way, but at least for this section of the system, there are no pieces that are my own servers except my app server. Let me go back to the diagram: I've got my app server, I've got Lambda, and everything else is run by AWS. I don't have to troubleshoot Amazon; there are people who do that troubleshooting way better than me. I don't have to pay a fortune for it, because I'm not really running that many instances anymore; when I looked at the bill for T-Mobile Tuesdays, I was just amazed. Lambda cost almost nothing, and on top of that you get hundreds of thousands of free invocations, so it was fantastic. My EC2 bill was like this, and everything else was like this, and I thought: I want more of this, less of that. Also, for the juniors, this gives me a way to get them involved. Creating and modeling the events is a more senior activity, but once the event stream is set up, you can tell a more junior developer: hey, can you also send an email when the order is created? How would they do that? There's a queue that says "order created": listen to the queue, subscribe to it, send out an email. All of a sudden, instead of having to understand the whole application infrastructure, how it deploys, getting docker-compose to run locally, and all that, the task is just: read from this queue, write this thing. And the last item on the list I get for free: because it's built on Amazon and there are no servers of my own, if I want it to scale, I just pay more. I'm not going to have to re-architect my system. Well, if you get to Facebook or Twitter scale you're going to have to re-architect, but I can only wish I had that problem. All right, thank you.

You know what, I'm not going to put the pressure on you first this time, San Francisco, because we have a few questions online; more than a few, actually.

Victor asks: how do you make sure events haven't been tampered with? What's the security model around S3 as an event store? The great thing about S3 is that, like every AWS service, it's subject to IAM policies and permissions. One of the other things I like about this architecture is that I didn't have to build anything to create permissioning: Amazon has a great permission model, and for S3 it's very specific; you can even create an IAM policy scoped down to a bucket prefix. So there's a lot of built-in support there.

Demetrios (I hope I didn't butcher that) asks: why DynamoDB instead of Kinesis? Sure. It turns out it partly is Kinesis: DynamoDB uses Kinesis on the back end, and if you look at DynamoDB Streams, it's really an early version of Kinesis. So why not Kinesis directly? It's really the read-after-write problem. With DynamoDB I can write my events and get a consistent read immediately; if I write to Kinesis, I have to wait until it goes through the Kinesis machinery before I can do my consistent read.

Victor follows up with: are you considering a dedicated event store database, like Event Store or Eventuate? Here's my personal take. One of my goals was not to have to manage a thing, and not to need a person who knows a database, if I can get Amazon to do it. I think those databases are fantastic, with a lot of great features, but what I'm really after is to surrender to the cloud, if you will, and have Amazon do it. And event sourcing, at least how we're using it here, is such a simple model that I feel pretty good about doing it this way.

Okay, one, two, that's it, but I'm going to start back there because I saw his hand first; don't make me feel guilty by raising your hand; nope, I'm not looking.

Audience: Hi, I was wondering, how does replay work, especially with large amounts of events? You have multiple partitions,
right, and you need to apply all of them, every time? That's right. I'll just quote Greg Young on this: for the most part, until you get to a thousand events, don't worry about replay; it will be just fine. If you think about the number of objects that have a thousand events, it's really, really small, and usually it's not in a space I've had to work in; most things I've created have well under a hundred events. And, as I was saying, because we're smashing so many events into a single DynamoDB record, it usually turns into a single DynamoDB request.

Audience: Kind of a follow-up, perhaps a naive question, but if the events are really separated by time, say ten years apart, will I be able to download all of that and run it back on my laptop? No. If you really have that much stuff: for a single object, you can absolutely pull its events right off DynamoDB; if you're talking about going back through so much history that you can't hold it all, you can create a snapshot at some point. The snapshot would be like an accounting system's end of day: here's the end-of-day close, and you don't have to worry about anything before it.

And the very last question: how do you handle sharding of commands to aggregates? Why would I need to shard? Okay, I think I might understand; I'm going to interpret the question as: how do you deal with multiple command handlers being invoked at the same time, potentially causing conflicting requests? One of the wonderful things about DynamoDB is that when you do an insert, update, or delete, you can attach conditions to the request, and one of the things that gives you is optimistic locking, effectively. If two writers try to insert the same event, say event 345, DynamoDB will pick one winner; the other gets a failure and has to try again, very similar to how a traditional database works.

All right, thank you so much, man.
Info
Channel: The Go Programming Language
Views: 20,094
Keywords: golang, gopherfest
Id: B-reKkB8L5Q
Length: 38min 49sec (2329 seconds)
Published: Thu Jun 22 2017