Concurrent Programming with the Disruptor

I'm Trisha Gee. I'm a developer at LMAX, a financial exchange in London. We created an open-source concurrency framework called the Disruptor. I'm also a leader in the London Java user group and a blogger of anything that pops into my head.

First up, has anyone actually heard of the Disruptor? Oh well, most of you might not need to be here then. That's fine. And has anyone read any blogs or seen any other presentations? OK, cool. And has anyone actually played with it? Right, you guys can help answer the questions at the end.

So I'm going to give an overview of the Disruptor for those of you who haven't heard about it before. I'm going to talk about what it is, why we developed it, and how it works under the covers. I'm going to go over some of the code to help you write the code to use the Disruptor, and then I'm going to go through a crazy use case that's not really all that realistic but will show you how to model your domain using the Disruptor. And then there's plenty of time for questions and answers too, because, especially given some of you guys know about the Disruptor and some of you don't, the questions will really help drive what you guys want to get out of this session.

So firstly, what is the Disruptor? At its very basic bottom line, I guess, it's a really quick way to pass events or messages between threads. It's a data structure with pretty much no contention in most circumstances, and it lets you pass objects between different threads. It's a very quick way of sending things, and because of the way the data structure is structured, it allows you to go truly parallel in your architecture. We'll go over what that means later on.

So basically I just gave you a load of words but not really any context of what it is. At the heart of the Disruptor is the magic ring buffer, which is not magic, but it is a ring buffer, so at least I've got halfway there. At LMAX, whenever we start putting Disruptors into
the architecture, the first thing we draw is this doughnut right in the middle, and that kind of represents what's going on in the Disruptor. It's very simple: it's just an array, an array-backed ring buffer. A lot of queues, a lot of bounded queues, use exactly the same mechanism to store data, and it's nothing particularly special. On top of the magic ring buffer (which has got its own sequence number somewhere, but I'll come to that in a minute) you have publishers, which obviously push stuff into the ring buffer, and event processors, which read things from the ring buffer. So far there's nothing particularly interesting or exciting about this, let alone special.

So first up, everything in the ring buffer has a sequence number associated with it. The sequence number is really a very simple way of being able to figure out which slot you're looking at, because you just do a mod of the sequence number to find out the slot that the event is in. Hopefully we'll come to that in more detail later. At the beginning, the sequence number points to the first slot which is available for reading from the ring buffer. So if I write just one thing into the ring buffer, it gets written to slot 0, which is why slot 0 is colored in, and my sequence number is set to 0, which says to anything that wants to read from the ring buffer: the first place you're going to start is the thing at 0. As I start adding more things into the ring buffer, this sequence number obviously increments. There's not really anything particularly special about this.

When you fill up the ring, this is the bit where it's a circular buffer: it just carries on. The sequence number increments pretty much indefinitely. It's a long value, so you can have an awful lot of numbers before you start to run out of sequence numbers. This thing just keeps going and keeps wrapping; as you keep putting stuff into it, your sequence number just
keeps incrementing.

Creating one of these things using the Disruptor is fairly straightforward. You just ask for a new ring buffer, and you give it an event factory to create the things which are going to go inside the ring buffer, and the size of your buffer, which incidentally should be slightly more than 12, because that's a little bit small for a buffer.

So, coming on to what's actually inside the ring buffer itself: these things which we call events. These events are buckets for you to write things into. It's not necessarily the same as a queue, where you push things onto the queue and consume things out of the queue. It's not quite the same mental model. The ring buffer is a fixed size, and the things in that ring buffer are just fixed buckets for you to put stuff into. This is really friendly on the garbage collector, because it means you're not constantly creating and throwing away objects; they're just going to stay there indefinitely. So it's much more performant this way.

The reason why you need your event factory is that it's used right at the beginning, just to pre-allocate, to pre-initialize, these buckets when you first start up your ring buffer. It makes the API look a little bit odd, because what you're doing is passing around references to the event, the bucket in the ring buffer, and manipulating that event, rather than asking for things off the ring buffer and pushing things onto it. But hopefully when we go over this a bit more it should become a little bit clearer.

You can put any sort of event inside your ring buffer. It's nothing special; it doesn't have to be any particular type. There are going to be types which are more performant and types which are less good for performance, so it depends a bit on what your use case is, but it doesn't have to implement anything or override anything. You can have anything. So in this particular
example I just use a simple event, which is a bucket containing a string value. My simple event factory just uses new to initialize that right at the beginning, but you'll initialize it with anything which is the blank initial state for your thing in the ring buffer. And then that simple event would have getters and setters for that value.

So now I've got a ring buffer and some buckets which will contain some stuff. But, more importantly, how do you put stuff onto it? The Disruptor itself contains a class called EventPublisher, which will help you publish things into the ring buffer. All you have to do is provide it with an event translator, which the publisher will use to write stuff into your ring buffer.

Let's go through the way it works. The event publisher asks the ring buffer for the next available slot to write to. Now, in this example the sequence number is set to 547, some arbitrary number, so when it asks for the next available slot it's going to get 548. Then it says to the event translator: do whatever you need to do to put a new event into the bucket at 548. So you pass it the event object itself and the sequence number associated with that new event. The event translator will do whatever it needs to do to write the new values into this particular bucket, and when the translator has finished, the event publisher will call publish. So it's just a simple two-phase commit: grab the right event, send it to the translator, get the translator to do what it needs to do to write something in there, and then call publish. Once publish has been called, the sequence number will increment, saying: OK, the next available thing to be read from the ring buffer is now at 548. Up until that point nothing can happen to the thing at 548. Not until it's published can anything do anything with it, because it's not finished yet.

OK, so again this is fairly straightforward. Your event translator just needs to implement the event
translator interface. It's all generic, so you can take the simple event object, and then you just call publish event with the translator when you're doing this. This example code is available online, so if you want to peruse it later on in a slightly more useful fashion, you should be able to get it there.

So now I've got a ring buffer full of stuff and I need to be able to read stuff from it. Now, because of the way it works, with the sequence numbers being the key to everything that's in the ring buffer, you can actually get some nice batching behavior from the ring buffer. So let's walk through this a little bit.

The Disruptor itself has a batch event processor class which will handle a lot of the heavy lifting for you. The stuff I'm going through at the moment is really the underlying workings of the Disruptor, which you don't really need to understand when you're programming with it, but it helps to have a good mental model of what's going on under the covers so that you can use it in a useful fashion. So the batch event processor is provided to you by the Disruptor. You just need to give it an event handler to say: when a new event is available in the ring buffer, what do I need to do with it?

Let's say we've got a ring buffer with things in slots 0, 1, 2, 3 and 4, so the next available thing to read from the ring buffer is the thing at slot 4. Now, my batch event processor hasn't read anything yet, which is why it's got a slash through the sequence number. Effectively it's got a sequence number of minus 1: haven't seen anything yet. So the batch event processor is going to say to the ring buffer: I haven't seen anything yet, I'm expecting the thing to be in slot 0, give me the thing at slot 0. And the ring buffer says to the batch event processor: well, actually, I'm full up to 4, so you can have everything up to 4 quite happily, because it's all there
ready for you to get. So then the batch event processor basically grabs the thing at slot 0 and passes that on to the event handler, grabs the thing at 1 and passes that down, then 2, 3, 4 and so forth. It just passes them through really, really quickly to the event handler, one at a time: not in an array, not in a blob, just one event at a time. And at the end of that, there's a flag on event 4 to say it's the end of the batch, so if you want to do anything at the end of the batch, like flushing or committing or anything like that, that's the point at which you want to do that stuff.

Now, only when you've handled all of that batch will the batch event processor then update its own sequence number. It's the fact that it has its own sequence number that can lead to some of the interesting parallelization a little bit later on. At that point, anything that needs to know what that batch event processor is doing, where it's got to, can read that sequence number and go: oh, you've done everything up to number 4, great, so I know that you're finished with everything up to number 4.

And all you need to do is implement the event handler, so you just implement this interface here. As I said, you have an event, which is the same event as the bucket in your ring buffer, you get the sequence number, and you get a flag set to true if it's the end of the batch, so you can do any batching that you want to do.

OK, so that's the basic stuff. Most of you who've come across the Disruptor have probably come across that before, and hopefully it's a good intro for those who haven't seen it yet. At this point I want to stop for questions before going on to anything a little bit further. So are there any questions?

Yes? Yes. No, there isn't.

The question was: there seems to be a possibility for concurrent reads and writes. Yes, but what I'm going to go into is how you avoid that problem.

Yes? No. I mean, this is the simplest way, because you get this nice batching. You can write your own
if you want to, and I think there might even be one or two others bundled with the Disruptor. The batch event processor is a good place to start, because if you want to write your own event processor then you can just have a look at what it does. You can talk to the ring buffer directly and ask it for the next thing if you want to.

Yep? So, can you have more than one event processor? That's exactly the case I'm going to go into.

Yep? Yep. Yeah, hopefully that will become clear. If it doesn't, ask me again later.

Yep? I'm sorry, I didn't hear that at all. Something about performance implications [inaudible]? Oh, yes. It depends a little bit on the amount of RAM you have. I mean, a lot of these things are dependent on your hardware and the rest of the stuff that's going on in your system, so how big you want the buckets and how many of them you want is dependent on your system. The only way to really find out is to do testing with production-like data to see what that looks like.

Yes? What about the overflow case? Excellent, I'm glad you asked that. I will come on to that.

OK, cool. I will move on, and if I don't answer some of those questions adequately then just poke me again later.

Now, has anyone seen this diagram before? How come no one's ever seen this before? Fine. The reason I put this up is because it's on a bunch of blogs and papers and everyone gets confused by it. So usually I put it up here and say, have you seen this before, and then we ignore it. So let's ignore it. The point is, this is our usual demo for how you go parallel, how you have multiple event publishers (not publishers, you can tell I've got a cold), multiple event processors, and the fact that you can go parallel and use multiple Disruptors together in harmony. But we're not going to do this example. We're going to do something a little bit different.

Right, anyone know anything about cars? I used to work for Ford, so it's quite easy for me to come up with
analogies of cars, and most people understand, or at least have some sort of mental model of, how an assembly line works with cars. Especially if you do anything around agile or lean or any of that stuff, it's always talking about manufacturing and how you put together cars. OK, so we're going to take an analogy of a car on an assembly line and stretch it hideously to fit the Disruptor, which may or may not work.

So we're going to build what looks like the world's worst car, and it looks a little bit like my first Fiesta. Once you've got the chassis of a car, which is what that was supposed to be, you can have a number of different operations going on on that car in parallel. You can have someone putting the engine in, and you can have someone putting the seats in: the front seat, the passenger seat, the rear seat. All those things can go on in parallel and they don't actually interfere with each other. When you're putting in the engine, it shouldn't really interfere with what's going on on the back seat. And then once you've done that, you reach this milestone of your interior being complete. Anything which needs the interior to be complete before it can continue can then think about having a go.

So next up, when the interior is complete, we might think about putting the four doors onto the car, and the bonnet, or the hood or whatever, onto where the engine is. All my little dudes really love their job; they've got really smiley faces. Once the doors and the bonnet are on, the exterior of your car is complete, and then you can think about doing things like painting it. I keep meaning to draw a smile on that guy's face. I forgot to put a smile on it, so he's the least happy of my workers. Once the car's painted, then you can think about putting the wheels on.

Now, there's no point in doing this in any other order. There's no point
putting the doors onto the car before you put the seats in; it's just going to make your life much more difficult. You certainly can't paint the car until you've put the outside of the car onto it. So you know there's a workflow, you know there are dependencies, you know there are things which can happen in parallel and things which absolutely must wait for something else to happen before continuing. Only when you've done all of those things are you going to get a complete car.

What I normally do, and I completely forgot, is draw this workflow up on a whiteboard, but you guys probably couldn't see it at the back anyway. Effectively what this means is you've got a workflow which is reasonably complex, but it is understandable. You start with your chassis. You can put in your engine, your driver's seat, your passenger seat and your rear seat, and you can do all of that in parallel without them impacting each other, but those are the first things you have to do before you do anything else. Once that's done, you can put the bonnet onto the engine compartment. When the engine is in, the bonnet doesn't care anything about the seats; you have to have the engine in first, and that's the only thing it cares about. In terms of the doors, the doors have to wait to make sure all the seats go in before you can put any doors on the car. When all of those things are done, your body is complete and you can paint it, and when the painting is finished you can put your four wheels on.

OK, I'm getting the feeling that you understand this, you just don't know what this has got to do with the Disruptor. I'll get there, hopefully. The point is, we're going to take this reasonably complex workflow, and the fact that you've had to understand the dependencies, and we're going to wire it up using a Disruptor in a slightly artificial fashion. I mean, this really is stretching the analogy, but
the point is to get an idea of your mental model and how to do that using the Disruptor.

So, going back to publishing into your ring buffer: in this case the chassis is your event. The chassis is your bucket that you're going to attach things to. And this chassis has different properties. For example, I'm going to use Ford ones, and I'm not even sure if the Ford models are the same in the US, but you can have Fiestas, Focuses and Mondeos, and you can have different shapes and different sizes of chassis which go in there. The event translator, the chassis translator: it's its job to poke into the chassis itself to say this is a Fiesta-shaped chassis, and then publish that into the ring buffer.

Next up, we can have batch event processors with event handlers for all of the first set of dependencies. So we've got an engine handler, a driver's seat handler, a passenger seat handler and a rear seat handler. We have a sequence barrier, and we'll walk through exactly what that does later, but effectively it allows all of those event handlers to find out which chassis is available for them to read from the ring buffer. All these guys care about is that there is a new chassis in the ring buffer that is waiting and ready for them to put an engine in, or whatever. The mechanism they use to find out what their next available thing is, is this sequence barrier, and this sequence barrier inspects the ring buffer's sequence number to find out the most up-to-date sequence number that can be read from the ring buffer.

Next up, there are two different milestones after those four things have happened. The first is the fact that the engine is in there and it's ready for the bonnet, or the hood, to go onto the engine compartment. So this is the first sequence barrier, right at the top; this thing is making sure that the engine is in the chassis. The second sequence barrier, what it does is
it's trying to make sure that the interior is complete, the whole of the interior. Its job is to tell anything downstream of it which chassis have got all three sets of seats in: the passenger, the rear and the driver's seat. So we can then wire up our other handlers. Our bonnet handler, the thing which is going to put the bonnet on the engine compartment, looks at the first sequence barrier. Further down we've got the front door one and two handlers and the rear door one and two handlers, and those things are inspecting the second sequence barrier to find out which chassis have got all of the interior in.

Next up, we have another sequence barrier, which represents the last milestone, which was that the exterior is complete so the painting can start. The job of this thing is to make sure that all of the exterior is complete, so that all the doors are on, the bonnet's on, and it's ready for painting. It inspects the sequence numbers of all of those handlers to find out which chassis in the ring buffer has had all of those operations applied to it. And then there's a paint handler, and then there are some wheels. Every time I give this presentation I wish I hadn't put wheels on the car, because it's just a bit too big, frankly. It's just too much.

So the point is, at the end of all of this, you basically have a series of event handlers, one for every one of those independent actions that can happen on your car, and a sequence barrier for each of these dependency points, to figure out which things have to have happened before your next set of event handlers can happen.

OK, so we're going to walk through an example of this, because so far all you've seen is a lot of boxes on the thing, and I'm getting a few puzzled looks going, "I don't get it".

So the first thing that has to happen, before you can do anything else, is you have to put a new chassis into the ring buffer. So, my chassis translator. In this case the ring buffer has got a car in there. The ring
buffer has got cars in there. So far it has seen 37 cars, processed 37 cars: sequence number 37. The event publisher asks for the next available slot to write to, which is 38. The chassis translator will write the new chassis, if you like, into the next available slot. It's not going to do much with this chassis. All it's going to do (again, this is artificial) is stamp it with a VIN number and a type: it's a Fiesta-shaped chassis. It's going to put some sort of tag in there that says, right, this new thing, this is it. Also, in real life, what you're going to do with the real event is probably clear down all the stuff which is in your event, because don't forget this is a bucket and it's got all sorts of rubbish in it. Clear everything down, write the new values in there. Once the chassis translator has done that, the event publisher will call publish and the sequence number increments to say the car at sequence number 38 is available for putting stuff into.

Next up, all of our batch event processors have got different, independent sequence numbers. When we saw the car being built, we could see that four guys, or girls, could all work on the car at the same time. Now the other thing is, each one of those guys or girls can actually work on multiple cars at the same time. So my engine guy can go and put an engine into this car, this car, this car, this car, and it doesn't really matter what happens with the driver's seat guys, the passenger seat guys or any of the seat handlers; he's completely independent of that. So he tracks his own sequence number. The engine handler has got a sequence number of 33, which means it's put engines into every car in the slots up to 33. So when we look at the ring buffer, we can see that any of the cars up to sequence number 33 are going to have an engine in them. Anything that cares about that just cares about things up to sequence 33. Similarly, the seat handlers have got
their own sequence numbers too. So the driver's seat handler is at 36, passenger seat 37, rear seat 35. The point is, these things are actually running in their own independent threads, and they're managing their own state, their own life cycle, so they can be running independently of the other guys, because, like we said, the seat guy doesn't care anything about what the engine guy is doing.

So in this case, let's say the engine guy says, I'm waiting for sequence 34, because I'm at 33, and the ring buffer, or the sequence barrier, says, well, 38 is available, which means all of the slots from 34 through to 38 are available for writing to and don't have engines. So the engine handler is free to put engines into 34, 35, 36, 37 and 38 in a nice batch. The same sort of thing happens with the seat handlers.

Yep? I'm sorry, I can't hear you at all. Yeah. Yes.

So the question is that I've implied that you could put different types of things in, Fiestas and Fusions and Mondeos, and the event handlers are going to have to understand the fact that they can be different things. For this particular example, let's just assume they're all Fiestas, but there's no reason why you couldn't have two engine handlers, for example. You could have a diesel engine handler and a petrol (or gas, whatever) engine handler. And these guys could inspect that car and say, oh, I know that that model of car is the petrol one, so I'm going to do my thing, or, oh, actually I don't need to apply myself to that. So if you really want to shard or split some of your dependencies, you could have some of these event handlers saying: I do care about this event, I don't care, I don't care, I do care, I don't care.

Yep? Sorry? They can do, but what happens is, say in that last batch, let's say the engine guy is only going to process 34 to 37, because it knows it doesn't really care about 38 because it's the wrong sort of
engine. It's still going to increment its sequence number, because it's still seen it; it's still processed that event, even if it didn't do anything to it.

OK, so the seat guys are going to do the same sort of thing. They're all told they can process everything up to 38, and again they can carry on at their own pace. These things could be doing quite different things. I always say they could be going off to a database. I don't recommend it, because that's quite slow, but one of these things could be going off to a database to find out what sort of engine to put in there, or they could be going off to a file to log the sorts of things they did. They could be doing their own things independently, so they will be happening at different speeds. That's kind of the point about this: they can go at their own speed, and they track their own sequence numbers. Remind me to talk a little bit about how you might want to even out some of those speeds if they're going at different sorts of speeds.

So then, next up, the bonnet handler. Let's assume that the bonnet handler is doing something quite quick, so instead of going off to a database it's doing some sort of calculation based on what the engine did, to put the hood on the car. It's at sequence number 32, which you can more or less see. The engine handler is at 33. The bonnet handler is going to ask the sequence barrier: I'm looking for 33, because I'm at 32 and the next thing I care about is 33. And that sequence barrier is going to say: you can process the thing at 33. It doesn't say you can process the thing at 38, because even though the ring buffer's got chassis in up to 38, those haven't been seen by the engine handler. They don't have an engine in, so there's no point in the bonnet guy doing anything to them. That's what that sequence barrier is for: it tracks the things which matter downstream of it. So in this case the sequence barrier says 33 is the
thing you want, and that bonnet handler writes directly into the thing at 33. It manages that event directly.

More interesting is when you get down to the sequence barrier which manages multiple sequences. Here we said this sequence barrier represents the state of interior complete. The chassis which has had all seats put into it is at the lowest of those numbers. The seat handlers have got 36, 37 and 35, which means that the last chassis which has got all the seats in is the one at 35. So the job of that sequence barrier is to return the lowest of the sequence numbers that it's inspecting, which in this case is 35, so it forwards 35 on to everything else. This means that all of the individual door handlers are just going to be processing everything up to number 35. Even though there's a passenger seat in number 37 and a driver's seat in number 36, that's not important. The last thing which has got all the seats in is number 35.

So then each of these things can manage the events that it cares about. In this case the front door handler was at number 28, so it's way behind everything else. Let's just say he went off for a break and he's got to catch up really quickly, and he's just going to have to manage that batch of cars really fast and go and put doors onto all of them. And then the other guys are going to go around and put the doors onto the things that they care about.

The next sequence barrier here does the same thing. The paint handler, let's say, is at sequence number 27. This sequence barrier is the one which says your exterior is complete: your doors are on, your bonnet's on, everything else is on the inside, and at this point it's ready for painting. The paint handler is going to ask it for the one after 27, and, inspecting all of those sequence numbers, we've got 32, 28, 30, 34 and 32, so the lowest of those is number 28, which
means the paint handler can paint the thing at 28. I'm not going to show the ring buffer because it gets tedious.

And then at this point the wheels have all processed: there are wheels on everything up to number 27. The paint handler has painted everything up to number 27, so the wheels are going to ask this sequence barrier, I'm looking for the thing at slot 28, and this sequence barrier inspects the paint handler and says, well, you can't, because it's at 27. So they're going to sit there and wait until 28 is available, because there's nothing ready for the wheels to go on yet.

And getting back to the point of how you stop the ring from wrapping: in this particular workflow we know the wheels are the last thing which goes onto the car. The wheels are the last thing which is ever going to process anything in that ring buffer. So what we do is we wire up the ring buffer to inspect the sequences of the last things it cares about; in this case it's these four sequences. These wheel handlers have processed everything up to 27, and this ring buffer goes from 27 to 38, which means that it can write something into the slot at 27 but it can't write any further than that, because the wheels haven't processed their stuff. The wheels are waiting for 28; you can't overwrite 28 until the wheels are done with it. And all you do is call setGatingSequences on the ring buffer with the sequences that you care about, as varargs, I think. The ring buffer doesn't need to care about all the sequences downstream, because it doesn't really matter. What does matter is the absolute last one, which it mustn't wrap around.

Cool. I'll go on to the caveats in a minute. Questions based on that?

Yes? Why do you need the first sequence barrier? You don't. You could ask the ring buffer. I mean, I can't remember the exact syntax. I can't remember if you need to give the batch event processor a sequence barrier, in which case you ask
the ring buffer for its sequence barrier. But for consistency, the event handlers will always be looking at a sequence barrier to find out about sequences. Yes?

Right, so how do you handle transactions if things are going on in different event handlers? What we do is commit the transaction in that final event handler. So we'll open a transaction right at the beginning, we can do whatever we want, and we commit it at the end. It depends a little bit on what you're doing, but we definitely do transaction handling across multiple event handlers. Yes?

Yeah, I mean, it would probably be neater to put a sequence barrier at the very end to say "construction complete" or something like that, and then pass that back to the ring buffer. You don't need to do that; you just put in all the sequences for it to inspect. It's the same thing; it has to look at all four regardless. So, questions here? Yes?

No. When you say "wait for", it will go and look, so it's always getting the most recent one. It's not constantly polling; that would be quite expensive. It's just going to go and get the one it needs, when it needs it. Okay, so the sequence barrier doesn't wait, but the event handler waits: the event handler will sit there and wait until the next available one is there. Yes?

Yes, okay, so the question is how do you stop them trampling all over each other, given that in this demo I've said all the event handlers are working on the same car, so surely you're going to get contention and all sorts of problems. There are two answers to that. The most important one is that you only get contention when you write, not when you read. So the point here is that with your door event handler you'd be writing to the door field on the car; you won't be writing to anything else. You can read anything from further upstream, so if you want
to look at... say you're the door handler and for some reason you want to have a look at what type of passenger seat was in there, that's fine. You know that's already been written, and that's okay because the sequence number's been updated and you know that the seat guy's finished with it. So these sequence numbers are the key to knowing which fields you can safely read. In terms of writing, you should only have one event handler writing to each field.

Sorry, the question was: do the fields need to be volatile? No, the fields don't need to be volatile, because you do whatever you need to do on the car and then you update your sequence number, and because that's volatile everything else gets flushed out, so all of your writes are visible. In actual fact, in the actual implementation it's not actually volatile, because there's some really clever stuff they've done using the Java memory model and various other operations, which means it still effectively works as a memory barrier. But yeah. Yes?

The only word I heard was "exception". How do you handle the exceptions? Yeah, so there are a number of different ways you can do that. There's no magic exception handling built in with the Disruptor, but you can do things like... When I worked at Ford, they would sometimes accidentally build the wrong sort of car. There was a case where they built a van with no doors, because you could put all the sides on and forget to put a door in. What you could do in that case is, if you have exception handling in one of your handlers, you could mark something in the car to say "I failed, I didn't do this", and at that point you don't want any of your other downstream handlers doing anything with that car. Somewhere in this, maybe right at the end, you might have an event handler which checks for exceptions and then
handles them appropriately. You might want to send another message saying, "Okay, this is an error message" or "this is an error event", and then have something downstream take care of that. What you shouldn't do, because all of these event handlers are running on separate threads, is throw runtime exceptions that don't get handled, because they will just die in a big fat heap. Yes?

So, in terms of... do you mean how do I map the sequence number to the array index? In actual fact, when I played around with this, the handler doesn't really care about the numbers. It passes them around because it's useful, but all it is is a ticker to say "go up, go up, go up". For the ring buffer, obviously, sequence number 38: you mod that by the size of the array and it gives you the index into the array. So the ring buffer cares about the sequence number; for the ring buffer it has a meaning beyond just being a sequence number. But for the event handlers it has no context other than as a ticker that will just keep going up, because they track their own sequence number.

So the question is whether there should be a map between the sequence number and what it means. In this case, not really, because the only mapping between the sequence number and the event, which is the thing that needs to have something happen to it, is the fact that it's an index into the array in the ring buffer, and then you pass the actual object on to your event handler: "this is the event you care about". So if we come back to the thing at 33, it's an actual event. You get the reference into the ring buffer, and in this case the bonnet handler is getting the event with sequence number 33, but it's actually getting the actual car object as well: "Here is the car object, its sequence number is 33, do what you want with this object." The ring buffer is only an array, so all you need to do is mod 38 by the size
of the array, which is 12, and that gives you your index into the array. That index then gives you your object, and you just pass that straight on. Yep?

No. Am I going to talk about the multiple producer case? No, because that's the one case where we do get contention, because this is where you have two things writing into the ring buffer. You can totally have multiple producers; the code is there to do it. But it is much slower to have multiple producers, and ideally you want to structure your code so that you don't have them. I've spoken to a couple of people whose architecture definitely looks like it needs multiple producers, in which case it might be that the Disruptor's not the right thing for them. But more often than not, when you talk to people... Our producers at LMAX are really, really simple: all they do is take raw data off the network and plonk it straight into the event and into the ring buffer, and everything happens in event handlers. We're used to thinking of producers as doing something useful and doing some sort of logic, and maybe you've got multiple ones because you might have, you know, a thread pool pulling REST requests or whatever, or pulling things off the wire. Yeah? Yes?

Yeah, so I think the answer is I don't know what the answer is to that. People are doing it. When we do it further inside our exchange and we're doing it service to service, we don't have that problem. When we're doing it at the web tier and we're talking about WebSockets and so forth, then you can't get high performance that way, but you can't get high performance on the web server anyway. There's a Google Groups site for the Disruptor and there's a whole email thread about this exact problem, so it's definitely worth going to have a look at that, because it talks about some use cases that
do work and some which don't work. Yes?

It never gets removed from the ring buffer. We don't consume; it's not a queue, so you don't put things in and consume stuff, because consuming is effectively writing, and writing gives you contention. So all we ever do is overwrite it when the next available slot comes up. Where was that? Look, so this is the point where, rather than consuming it and blanking it out, you just write over it. [inaudible]

You have to create back pressure if you want to be performant. If there's something slowing down your system... Say you're a ticketing website, the Olympic ticketing website, and you've got lots of people trying to put requests in. The last thing you want to do is carry on taking requests that you know you can't service because you've got a bottleneck down here. What you really want to do is create back pressure, so at some point you get to your web tier and put a splash screen up there saying "don't put any more orders in, because you're not going to get them". So the key is to create back pressure; the key is not to infinitely resize the queues. [inaudible question] Yeah, right.

So the question is what happens to the publisher when all the event handlers haven't actually caught up yet. It doesn't do anything. "Get next", in that case... yeah, it's just going to block; it's not going to put anything else in. And hopefully what it should be doing is saying further upstream, to whatever service is feeding it, "Don't give me anything more, I'm blocked, I can't do anything." I went to a session last week that was talking about, I can't remember which, one of the messaging frameworks, and it just infinitely lets you put messages in, and then you blow up with out-of-memory, and that's not very useful at all. Yes?

Well reminded. How do you handle the situation when one event handler is substantially slower than the others? Yes, so let's
say... So we have exactly this in... well, in answer to your question, I think... no, your question was multiple publishers. I'll get there in a minute. No, rubbish, I won't get there.

How do you handle... Yes, so in real life you wouldn't do something as crazy as this particular workflow. It's just crazy, because if you want to be really fast, each of these event handlers is working on its own individual thread, and each of those threads should be pinned to a core somewhere, just using that core, keeping its cache nice and hot and all the rest of it. If you've got something like 27 stages in there and you're running on a dual-core laptop, you're not going to get anywhere with that. In real life the number of stages you'll have is maybe two or three. One of those stages for us is journaling to disk and another is doing the business logic. The business logic is very fast; it's just maths, and the processor is just going to carry on churning through the instructions as quickly as possible. Journaling is extremely slow.

So it depends on why it's slow, but you can shard those slow consumers. For example, you can have one handling odd numbers and one handling even numbers. That way you can get it nearly twice as fast. It does depend a little bit, because of course if they're writing to the same disk then there's a certain amount of contention there. So it depends on what the problem is, but if you've got slow consumers, you want to identify them and then figure out a way to split the work across more consumers.

So is there some clever way to figure that out? Right, the Disruptor does not contain any analytics or monitoring or any useful stuff like that. One of the things we do at LMAX is inspect all those sequence numbers... what's it called? My brain's totally died today... we can inspect the sequence numbers and expose them over JMX, so you can see
that one's much, much slower than the others. We also do lots of automated performance tests, so we'll catch problems before they go into production, and those performance tests are done with production-like load profiles. So this is more of a performance testing type of thing. The Disruptor itself doesn't say, "Consumer number 27 is really slow, you ought to do something like shard it with odd and even numbers." It's not that smart. Yes?

It doesn't... well, the sequence number's not... it's done when, in this case, in our crazy car case... oh, hurry up... it's done when it gets to the last processor. So everything up to number 27 is done. Yes? The ring buffer? No, the ring buffer keeps track of the last ones only. It doesn't care about the ones in the middle, because it doesn't matter. Usually, I mean, you probably wouldn't have multiple last ones in the real case.

If we go back to the one I said we weren't going to talk about, which is more like the real world... Okay, so this is the LMAX architecture. This is what we actually do with the Disruptor. We have our publisher, a receiver, which pulls stuff off the network and just puts it straight, raw, into the ring buffer. Then we have replication, which takes that message and sends it off to a secondary server somewhere so that it's kind of safe, for disaster recovery type things. We journal it to disk, because everything we do is in memory, so if everything goes horribly, horribly wrong we just replay the journals to get our in-memory state back. We can do those two things in parallel, because replication and journaling don't care about each other. Then we do the business logic, but the business logic does care about journaling and replication having happened, for high availability and disaster recovery. There's no point doing an order on something we haven't saved somewhere, because if your server
crashes, you're absolutely dead. So we've actually just got a diamond dependency here: we have replication and journaling both looking straight at the ring buffer, doing their thing, and then we have business logic inspecting both sequence numbers, and then that carries on doing its thing at that point. Our business logic is actually also a publisher into a second Disruptor, so it publishes onto another ring buffer, and that has what we call a publisher, which puts that event back onto the network for some other service to pick up somewhere else.

So... wait, I'm still in the middle of answering another question, which is niggling me. About business logic... oh yeah, that's right: how do you make sure that you're not overwriting all the fields in your car with different event handlers? Don't do it. Don't keep writing back into your event. It kind of works, but it's kind of risky; you could be overwriting yourself. What we actually do in our event handlers is have them completely independent, so one's sending the event straight out to a secondary server and one's journaling it to disk. They're not doing anything to that event; they're not writing back into the ring buffer. You can if you want to, but that's not going to give you the most performance and it's not the safest thing to do. Yes? I'm sorry, I can't hear you.

Yeah, I haven't drawn the sequence barriers. I've drawn the lines so that the business logic is inspecting the other two sequence numbers. This is an old diagram from a different presentation, which is why the sequence barriers aren't in there. Yes? I'm sorry, I can't hear you.

Why do we need the second Disruptor? It's nicer, and it's just faster to do it that way. What happens then is that the business logic isn't doing the expensive thing of writing stuff onto the network; it's just going to quickly chuck the new event off onto the
ring buffer, and then we can get nice batching again with the publishers. It's separation of concerns. In this case the business logic shouldn't necessarily be matching your orders, doing all this stuff, and then worrying about marshalling it to put it on the wire. That's not part of the business logic, and the easiest way to do that is to just chuck it into a ring buffer and let something else take care of it. Again, this is I/O, so this is a slower thing, so it should be happening somewhere else. Yes? Yes.

So it was just around the point that the event handlers are always running in their own threads. If you want it to be super, super fast, then you need to make sure that you only have as many threads as you have cores, so that you're not swapping things on and off. But it depends a little bit. What I wanted to do with this example... the reason the Disruptor's interesting is that it's extremely fast and people use it in high-performance systems, but what I think is quite interesting too is the fact that you can get this parallelization, and you can have problems which don't need the speed which can still benefit from being able to run in parallel. You get a slightly simpler way of thinking about things, without worrying about multithreading, concurrency, synchronized or whatever.

All right, nearly out of time, so I want to go over my caveats. I'm going to go over my whole presentation all over again. Okay, caveats. If you want the Disruptor to go fast, make sure that you have no more threads than you have cores in your system, preferably slightly fewer; you probably need a thread for GC or whatever. So if you want to go really, really quickly, then you don't want lots of event handlers; that's not going to work for you.

Your ring buffer is going to be bigger than 12; 12 is a very silly number. The point about the ring buffer is that it's a buffer:
it's supposed to deal with a big flood of messages, and then you're supposed to drain it right back down again. So the size of the ring buffer is going to very much depend upon the memory you've got available and the sort of profile of your system. Do you get loads of messages inside the same millisecond, or do you just get a nice slow steady stream? If you've got a slow, steady stream, then your ring buffer's probably going to be quite small.

Event handlers on separate threads: mileage may vary. Yes, right, so this comes back to the point that in this example you write back into the car in the ring buffer. It does work, it is a pattern you can use, but it may not be the most performant thing you can do, and there might be different ways to architect your system. For example, like I said, your event handlers really want to be completely independent: write stuff over here, go and do something over there, go and put that over there. But starting to write stuff back into the same place is a bit fraught with danger.

There are some other bits and pieces. We talked about the fact that event handlers will be waiting for the next available slot. There are different wait strategies depending on various criteria, so you can plug them in and see how they go; you just test them and find out which is fastest for you.

In the same way that you can read stuff in batches, you can actually publish stuff in batches. Bearing in mind that you know the last event handler's number, say you know that the last event handler has processed up to number 27 and you've got a boatload more slots available, you can basically say to the publisher, "You've got all of those slots to write to." That's easier with a drawing, and I don't have time to do that right now, but basically you can do batch publishing as well as batch reading.

Multiple publishers is a source of contention, but you can have multiple publishers. There are
different types of event handlers, which may not be around in the future, so it's worth poking around and seeing what's available. There's a wizard to help you chain everything up and build up your dependency graph. And actually you don't even need the ring buffer: you can write your own data structures, as long as they implement a certain interface and behave in a particular way, and you can plug in your own data structure on top of this.

So if you're using the Disruptor, what you get is a framework which encourages you to model your domain. You really have to look at what your actual dependencies are, what can really happen in parallel, and what needs to wait for what else. Instead of shipping it off to some framework and letting that framework magically figure out a good way to parallelize it, this way you can actually look at your domain model, because only you know your domain model, and really split it into the things which can run in parallel and the things which can't.

As well as the ability to run in parallel, you get really nice, simple, single-threaded Java. Because each of those event handlers is running single-threaded, you don't have to worry about anything around synchronized or multithreaded code or contention; it's just really nice, simple, plain old Java. And you get reliable ordering, because everything is stamped with the sequence number, so you just know which order things happened in. And it gives you nice ways of being able to... I talked about journaling and the fact that we journal everything to disk and we can replay that; because everything's got a sequence number, it's quite nice for that. And on top of that it can be extremely fast, and is very fast, but that's not the only thing that's cool about the Disruptor.

More information is available at... it has recently moved to GitHub. It was on Google Code; it's now on GitHub, and there's a wiki on there and a bunch
of links and the white paper, all sorts of stuff. There's a bunch of blogs about it. My stuff kind of talks a little bit more about how to use it, and Mike and Martin, who really helped create the Disruptor, have written some really cool stuff around why it's fast and all the underlying stuff about how they managed to get rid of the volatile and things like that. So if you're interested in the deep, core, high-speed techy stuff, then their blogs are the ones to read. There's a bunch of presentations online about it, and the best place to go if you've got any questions is the Google Group, because there are loads of really smart people on there who will answer your question, plus loads of the questions have been asked before, so it's a real source of information. Coming soon is version 3.0, at some point, where everything will change. That's it. Thank you.
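
The sequence arithmetic described in the talk (modding a sequence number by the array size to find its slot, and stopping the publisher from wrapping past the slowest final handler) can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not the Disruptor's own code: the class GatingSketch and its method names are hypothetical, and the numbers are the ones used in the talk (a 12-slot ring buffer, four wheel handlers that have each processed up to sequence 27).

```java
import java.util.Arrays;

// Hypothetical sketch of the talk's sequence arithmetic; this is NOT the Disruptor API.
final class GatingSketch {

    // Map an ever-growing, non-negative sequence number to a slot in the backing array.
    static int indexOf(long sequence, int bufferSize) {
        return (int) (sequence % bufferSize);
    }

    // The slowest of the final handlers decides how far the publisher may go.
    // Assumption: with no gating sequences at all, nothing limits the publisher.
    static long minimumSequence(long... gatingSequences) {
        return Arrays.stream(gatingSequences).min().orElse(Long.MAX_VALUE);
    }

    // The publisher may claim `next` only if the entry it would overwrite
    // (the one exactly bufferSize behind) has been processed by every gating handler.
    static boolean canPublish(long next, int bufferSize, long... gatingSequences) {
        long overwritten = next - bufferSize;
        return overwritten <= minimumSequence(gatingSequences);
    }

    public static void main(String[] args) {
        int size = 12;
        long[] wheels = {27, 27, 27, 27};                  // four wheel handlers, all at 27
        System.out.println(indexOf(38, size));             // prints 2
        System.out.println(canPublish(39, size, wheels));  // prints true: reuses the slot that held 27
        System.out.println(canPublish(40, size, wheels));  // prints false: would overwrite 28
    }
}
```

In this sketch the publisher only ever compares against the minimum of the final handlers' sequences, which mirrors the point made in the talk that the ring buffer does not need to know about handlers in the middle of the chain.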
Info
Channel: Oracle Learning
Views: 12,897
Rating: 4.9333334 out of 5
Keywords: java, concurrency, disruptor
Id: eTeWxZvlCZ8
Length: 60min 13sec (3613 seconds)
Published: Fri Jan 25 2013