core.async: Communicating Sequential Processes using Channels, in Clojure - Rich Hickey

I'd like to thank everybody for coming. I'm going to talk about Clojure core.async, which is an implementation of CSP-style channels for Clojure, and along the way talk more about channels in general, their use for solving these kinds of problems, and how they contrast with some other solutions.

The first thing, whenever we do anything, is to enumerate the problems we're trying to address. In this case I think there are two, which may not make sense from looking at the slide as it stands, but hopefully will as we go forward. The first is that function call chains make for poor machines. Recently, and during the talk earlier, I tried to contrast the part of your program that's a machine with the part of your program that's information. Certainly there's a part of most programs that needs to convey stuff around, which is a mechanical kind of activity, and we often end up with callback APIs to do this. That is a chain of calls, and it ends up doing the job of conveying things around quite poorly. The other problem is that a lot of real-world APIs expose the endpoint to something like I/O via a callback API, so it's something we encounter and have to address.

The premise here is that there comes a time in all good programs when we need to keep things separate, and we need to isolate things beyond the kinds of isolation we can do with calling. In fact we want to isolate ourselves from call sequences. You're going to start using queues inside your architectures so that the producer of some information and the consumer of some information know nothing about each other, completely nothing about each other. All the encapsulating approaches we have that involve chains of function calls fail to do that at some point, and you can tell they fail because you won't be able to move things to other places, or you'll have some build dependency; that's where the problem will appear. So what we want
to do is raise up conveyance: the part of your application that says, I have something over here that's going to produce some information, and that should be the end of what it knows. It's going to put it on the end of a conveyor belt, and someone else is going to come and pick that stuff up and do the next thing with it, and never the twain shall meet. We want that to be a first-class thing. I think what happens in most architectures, whether in process or across processes, is that you start introducing queues, because queues are a representation of this. So we're going to have processes, and we're going to have queues or channels.

Now, if you're using Java, we already have queues, right? There are the java.util.concurrent queues. But there are a couple of problems with using them in practice. One is that they coordinate via thread control; in other words, they block actual threads, which means you have to park a real thread on the end of a queue in order to use it. Another problem is that if we want to apply some solution both on the JVM and in the browser and other places where JavaScript runs, we don't even have threads there, so we need something other than this. Of course, if threads were free, that problem on the JVM wouldn't necessarily be a problem. But they're not free. They have stack size associated with them, which can become a problem when you have a lot of threads, and that's an efficiency problem for an individual server. There are also costs involved in waking up threads and getting them to execute work on your behalf. That's why we often can't afford to use threads. So queues are good, the java.util.concurrent queues don't always apply, and there are specialized queues that do a much better job for very particular kinds of scenarios, which Martin will tell you all about.

If we look at the situation we're in, we're facing events and callbacks. And what do we see when we address events and callbacks? We often see something like
listenable futures or promises. What happens is you chain these things together and end up with a web of directly connected relationships. It's very difficult to reason about control and flow in a program where all of your logic has been separated into little pieces: when this is clicked do this, when this message comes in do that, when this happens do that. The big picture of what your program is about is broken up into splinters and situated in all these callback handlers. So the logic is fragmented, and the vernacular term for that is callback hell, right? Anybody who's worked with a system that has a lot of callbacks knows this is a big problem.

There are also compositional problems associated with callback handlers, and something like Rx and observables does help with that. They help you, for instance, build transformation pipelines that are connected to callbacks, but they don't do everything. So there are a lot of problems with this kind of chain of callback handlers, no matter what kind of wrapping you put around it: the visibility of which handlers are in play, monitoring, and control over what's going to run on what thread. I'm calling you back; is it on my thread, is it on your thread? How many people have ever done something with callbacks where there was an admonition: I'm going to call you back, but don't do too much work in the handler? OK, that's a sign of a problem.

One of the problems is that when you build a set of call chains connected together, you're using that call chain as if it were a machine: one function calls another function, which calls another function, which calls another, and they're passing things along. What invariably happens is that your logic is fragmented. It's in two pieces, one associated with one callback handler and one associated with another. And especially if there's conveyance, that is to say, when you're told about
something happening you have to tell somebody else about something, so you're shoveling something through, maybe with a transformation in the middle. As soon as you've fragmented your logic, if you have any state at all (am I interested in this message, can I accept it right now, who should I send it to, what's the list of people who care), if there's any state associated with the decision-making process in your split-apart logic, you have to put it somewhere, and you use shared state to do it. This handler says, look in the shared thing and make a decision local to this handler; another handler says, look in the shared thing, put something there, and then go back. So you're forced into shared state when you do this.

Now of course people say, oh well, we have objects to encapsulate this. But an object doesn't actually change anything about this. It just puts a blue oval around it; that's all the object does. Nothing about the scenario is any different. If you have multiple threads of control running through an object, that object is really not in control; it's not keeping track of everything. All that shared state is still on you. Objects are sort of like marionettes where anybody can pull the strings at any time, so that doesn't usually work out that well.

So there are a bunch of techniques that have been used to re-invert control, if you will. The control you would like to have is: if there's something interesting happening, do this, do that, see if there's something else interesting happening, and so on, around and around, possibly with different sources of input. We'd love to just go back to writing a blocking program, because it looks nice, it's easy to understand, and it co-locates all the logic. The problem is we can't effectively do that if we have callback APIs and/or this thread problem. So C# and
F# both had some enhancements made at the language level that attempt to re-invert control. All they do is take code that looks linear and rewrite it to be callback code, but you don't see that, so what you see looks very straightforward. You say, in an async block, go do this thing that takes an arbitrary amount of time, then do this, then do that. What ends up happening is your thread doesn't get blocked there. It looks like it does, but it doesn't; it gets relinquished, and when the thing you were interested in completes, you continue.

I saw a talk from the Scala guys, who had copied C# async for Scala, and I said, that looks cool, we have macros, we should do that, it's probably a weekend project. It would have looked something like this. You have your original kind of code with an ordinary future, like a Java future, that blocks: go do something useful, then try to deref the future. At that point your thread is tied up, and when the future eventually completes, you keep going. The middle code is what you would do if you had callback handlers: you'd say, on complete, do something with the result. And that "do something with the result" is the fragmented code that would become messy if it had to share logic or state with other handlers. Finally, with something that mimicked C# async, you go back to something where you say, in this async block, treat all calls to blocking or asynchronous things as if they were blocking, but actually invert control. You say do something useful, and then you say await this future. What happens is that the calling code gets turned into a state machine and parked on a callback handler, which will resume that state machine on a thread pool thread whenever the interesting thing happens. Effectively your thread is back in the pool to do something else productive.

So that's a very nice thing, and it has a lot of utility, but it's
kind of just a subset of the problem that you want to address. It's sort of just sugar, because all it really addresses is RPC-style communication. Promises and futures are a single-shot relationship between two parts of a system: go do this and give me your answer, or here's a promise, and when it gets fulfilled, the one thing that will ever be sent to you is in that promise. It's just a handoff. So it's hard to use to model enduring relationships, and it's hard to use for external events, because a future is a one-time thing: I give you this for an answer. An event source is more like a stream; it keeps continuing to pass you stuff. So we want this sugar. Sugar was good. We just want to put it on a better cake.

Again, as I said earlier in the talk, the programming model you'd like to have is queues. Queues fully decouple producers and consumers. If somebody puts something onto this conveyor belt, who's going to pick it up? They have no idea. If you're picking stuff off a conveyor belt, who put it on there? You have no idea. "I don't know, I don't want to know." I used to say that so often in a course I taught that one of the students made a shirt for me. From an architectural standpoint that's a good thing, because it means you have independent decision-making.

They're also first-class and enduring. You can use them to model enduring relationships: here's a channel or a queue, put stuff on it all day long, all week long. You can make them independently monitorable, and you can make them so that you can have multiple readers or multiple writers. What's beautiful about a queue is that it does this job and it doesn't do any other job. It's not like an actor, where the logic of handling is connected to a queue and you get this one thing that's both a mailbox and a handler. Because two things,
it's not as good as one thing; one thing is better than two things. So we like this. And in fact this way of thinking about programs is actually old. Tony Hoare wrote the Communicating Sequential Processes paper, which is not exactly like what it has become, but it's the basis for this way of thinking. The idea is simply that you have multiple processes. I'm not talking about operating system processes here; I'm just talking about some piece of logic that's going to run independently of another piece of logic. Whether that's truly asynchronous, or you're just using cooperative time slicing like a JavaScript engine does, doesn't actually matter, because this is a pattern for organizing your program; it doesn't necessarily dictate a way of realizing it.

Channels are first-class, or at least this is what CSP has become over the years. You can pass them around, you can pass them as an argument, you can hand somebody a channel and they can say, OK, I'll hang on to this and put something on it or read something from it later. By default the semantics are blocking. In particular, for CSP-style channels the baseline semantics is a completely unbuffered channel; that is to say, it's a synchronization point, a handoff point. One thread comes in with something it's writing and will not go on until another thread comes and consumes it, or vice versa (and when I say thread, I mean thread like this). So you can actually use them for coordination. With those semantics you can build coordination primitives on top, and a lot of the CSP literature is based around doing that. But as soon as you introduce buffering, you get real asynchrony: one party can put something on, it goes into a buffer, and they proceed; somebody else can take it off later.

There's a long history of this. occam was one of the first languages to make this a first-class part of how it
worked. There's Java CSP, which is a library approach to doing this. And then of course Go is the most recent language that took this as first-class: this is how this kind of programming should work. I agree with them and their choice; I think it is a good way to do this kind of thing.

A lot of nice things have also grown up around this notion of channels. The first is that multiple readers and writers can be supported, so you don't have any binding. You can add more readers to support work distribution, and you can have multiple writers, so you can have separate authorship. Writers and readers can come and go; no one is bound up to the queue. You can pass the endpoints around; that's part of what I mean by first-class. The other critical feature, which is quite nice, is that there's always a construct called select or alt, which allows you to wait on one or more I/O operations. You can select, or alternate, on more than one channel: waiting for something to arrive on more than one channel, or waiting for a write to complete or a read to complete, or a timeout. And this is huge. Obviously in socket programming we do this kind of thing all the time. On the JVM, the queues and the thread machinery don't have anything like this. On .NET, and actually on Windows, they have long had a wait-multiple... who remembers what that's called? They have a multi-wait. A multi-wait is a very nice thing as an organizational construct, because you can again write a single piece of logic that says, if any one of these things happens, I'm going to proceed and deal with that, and then go back.

There's also a set of formalisms and algebras around doing analysis of programs constructed this way, so you can prove that they are free of deadlocks and things like that. There's none of that support built into core.async at the moment.

There are already implementations on the JVM:
Java CSP would be one, and Communicating Scala Objects was another. But both of these are tied to actual threads, so they don't overcome the thread limitations from before. They allow this model of programming, this shape of programming, but they would have difficulty using your machine efficiently.

So the idea behind this library is to create a CSP-style channels library for both Clojure and ClojureScript, something that works in both places where Clojure runs, where you can use the same calls on both platforms. On the JVM you can, with similar calls, get actual blocking, because sometimes real threads and actual blocking are the most efficient thing you can do. Or you can get this macro-generated inversion of control. It's like what the C# compiler was doing: we have a set of macros in Clojure that will take your code and invert control. They take code that looks like it's saying, read any one of these things and wait until it happens, and turn that into: make me a state machine, associate it with callback handlers on all these things, and relinquish the thread. If any of those things happens, one and only one of them will be seen to have happened by that logic, and the logic will be re-established on a thread pool thread and will continue to run. So it's beautiful: you write code that looks like it's blocking, and you get code that's actually doing all the callback work for you.

This is a big deal if you can do it right. You can still write traditional threaded apps this way. You can get higher connection counts on your JVM servers if you switch to the inversion-of-control system. You can even work on evented servers. And the big kahuna for the ClojureScript guys and people in that space is to fix the callback hell problem in the browser.

There are other ideas for using these kinds of channels over a network. It's actually difficult to convey all the semantics of
channels over a network because of the failure modes, and core.async does not currently contain any network channels. So we're strictly talking about in-process, inter-process communication, where the small-p processes are just pieces of logic that have independent lifetimes.

One of the cool things about core.async in Clojure is that it's just a library. It didn't require any modifications to the language. You can do this with just macros: they take your code and they rewrite your code; that's what macros do. So this is a job for macros, and we didn't need to touch Clojure to do this. What you get are independent threads of activity (we'll call them threads, but their alignment with real threads is weak), and you get channels that behave like queues. It supports both Clojure on the JVM and ClojureScript.

So it looks like this. You say thread with a body, and that allocates a real thread, and all the blocking calls in it are real blocking calls. Or you say go with a body, and you get this inversion-of-control thread that uses a state machine, parking, and thread pools to do the job. You have channels. Again, they're queue-like, they're multi-reader and multi-writer, and they're fundamentally blocking. They're unbuffered by default, or you can have fixed buffers. There are no indefinitely sized or unbounded buffers in core.async. We're not going to provide that, because it's a recipe for a buggy program. So you may have to tune your program and analyze it and see what's going on, but the net result is that you can write real programs that have genuine back pressure, which is a great thing as an architectural construct. When you don't have it, you're always struggling in its absence.

The API is pretty straightforward. To create a channel you call chan, or you can say (chan 10), which gives you a fixed-size buffer. Or you can create an explicit buffer and pass that to a channel; what's nice about buffers I'll talk about in a second. Then there are two
fundamental constructs, put and take. There's a parking version and a blocking version: the parking version has one bang, and the blocking version has two bangs. I'm not really going to talk much about the blocking version, because that's not supported on JavaScript. So the portable code you can write uses go and the single-bang versions of put and take. You put a value on a channel and you take a value off a channel. You can close a channel. If you are writing a JVM program you can mix modes, so a single channel can be consumed by both truly blocking code and this go code, and it can also be produced by either flavor of code. Being able to mix the modes is very nice, because at the edge of all these things you usually have to revert to code that didn't know about channels, so how do you get there?

Buffers: by default there are none, so a channel is unbuffered by default, which is strictly a rendezvous. A fixed buffer will block when it's full. But the other cool thing is that you can really incorporate policy into buffers, because you can hand a buffer to a channel. We have a couple of flavors of buffer that implement policies that would be common. For instance, the sliding window buffer says, if the buffer is full when I put something new on it, get rid of the oldest thing, the one at the front, which is quite commonly exactly what you want to do. The flip side of that is a buffer where, when it's full, every new thing that comes in gets dropped on the floor. These are the policies you choose in a program where you're not going to say, I'll just pretend this unbounded buffer is a good idea and see what happens in production. Where you're forced to make decisions, well, there you go: you make the decision and you incorporate it in the policy that's in your buffer, because we think unbounded buffers are bad.

Then we have this choice construct. We chose alt for that. So alt allows you to wait for multiple
operations, so you can block on multiple puts and takes. The fundamental construct underneath alt is a function called alts, which takes a set of operations represented as data and will wait on any one of them. The critical thing here is that when alts returns, one and only one of the things you were waiting for has happened. Either you've taken one thing off one of the channels you were trying to read from, or you've succeeded in putting something and you haven't read anything. So exactly one thing happens, and this is atomic across all participants. If more than one thing is ready, a random choice is made, or you can set priority, which means that if more than one thing is ready, the one with the highest priority is the thing that happens. But one thing happens.

So alts is a function that implements the work, and alt is just a macro on top of it that allows you to write code that works like this. I'm not going to get too much into the code, but this says: try to read from c or t, call the result val and the channel that actually succeeded ch, and then do something with them in function foo. This one says: wait for a read on x, call the result v, and do the work with v. When you pass a pair, you're saying, I want to output a value on a particular channel. Whatever operation happened, the thing on the right is the result of the expression. So these are the operations, this is the binding part, and that's what happens if that alternative is chosen.

Like Go, we use channels to represent timeouts, which ends up being very powerful and quite elegant. You create one by saying timeout and a certain number of milliseconds, and all it does is return a channel that closes after that amount of time. What's cool about that is it turns a timeout, which is usually an argument to every API call you make, into a first-class thing that you can, for instance, reuse across a whole set of calls. In other words, do this for
five minutes: you can make one timeout five minutes from now and put it in the alt of every operation you do, and after five minutes have gone by, that thing will complete. You didn't have to make a gazillion calls, all of which had "five minutes... well, now it's five minutes less three seconds." Who has done that with timeout code? It's just not fun. So this is quite clean. You can include the timeout in an ordinary alt; you try to take from it, and it will return when it closes. That allows you to share timeouts between operations, which is also powerful, and to encapsulate the actual timeout value and the way it's expressed.

If you're familiar with Go, you'll see that this has a lot of similarities to Go, and of course to the other things that have been built with CSP over the years. Some of the differences: all of the operations are expressions. This is Clojure, it's a functional language, we don't do statements, so everything is an expression. It's a library, not a language feature, so it didn't require the language to be built around it. There are trade-offs with that. Hopefully Go is going to be able to do what they do quite efficiently, because they're oriented around doing it, and in a library you're going to make some trade-offs. As for alts, I showed you the macro, but it's built on top of an actual function, and that's quite powerful: it allows you to write code that waits on an arbitrary number of things determined at runtime. Say you read a configuration file and it says, go try to read these seven things. If you have a language that has built this into statements, there's no way to make a statement that has an arbitrary number of branches in it. So it's nice to have it be a first-class function. And we support priority.

At the edges of your program you're going to be facing callbacks anyway, so is this just a waste of time? It's like, this is great, Rich, but I have this pile of things that all pass me futures and listenable futures
and promises, or I'm in JavaScript land and everything is a callback. Is this a lost cause? The answer is no. It's really easy to bridge to that code, because in your handlers all you need to do is take the thing they gave you and immediately put it on a channel. Just stick it on a channel. At that point you've inverted control. You've said, OK, callback, we're done; now it's in the channel system and everything else is going to be flipped right-side up, if you will. So you just put the values you encounter right into a channel. And those operations, put and take (you'll see this uses the words), need not be in go blocks. So that's your entry point to channels from code that's not otherwise inverted, because this code isn't inverting anything; it's just supplying a value to a channel.

Similarly, in JavaScript especially, you're going to need to get out of channel land, because there aren't real threads, and eventually you're going to need somebody to say, OK, do this, affect this widget or something. So you're going to need to revert, or re-invert, control at the edges of a JavaScript program, and you can use take in a similar way. Take can be executed in code that hasn't had this inversion of control, outside of a go block in particular.

The combination of these things means that you can deal with the browser. The browser is a place that's all callbacks, all the time. That's all they have; it's oriented around this. And it ends up being the case that friends don't let friends put logic in handlers. This is where the hell comes in; this is how you get hell. If you do what I just said, you can avoid this hell, because you don't have any logic in your handlers and your logic is all back in the same place. So when you use ClojureScript and core.async, you get the separation of logic between events and view, and it's a very big deal. I don't know if anybody's read
David Nolen's posts and whatnot, but you completely change the kind of code you can write in the browser. You can take things that were nasty, complete messes, even when written by expert JavaScript programmers, and turn them into things that are one fifth the size, where the event handling code is here, the updating code is there, and the logic is there, and it couldn't possibly be cleaner. So it fundamentally changes what you do. We were just having a conversation before I came up here, and I think the question is, if you had both, would you ever choose callbacks? And the answer is absolutely not. There are all kinds of ways to fix callbacks and make them slightly better, but you would never pick that if you had a choice. The reason you often don't have a choice is that not every language was either oriented towards this or has the ability to morph itself to work this way. But when you do have the choice, you wouldn't do this.

So once you have channels, what does your model look like? The first thing is that logic gets put back together. You have your logic all in one place, no matter how many different kinds of input sources you have or places you might want to direct stuff to. Because this is about conveyance, no matter where you're getting stuff from or sending it to, your logic can all be in one place. You can alternate all of the reads from all your sources together, and you can alternate your writes, or you can alternate the whole set of things, so that you know, for instance, that you're never doing more than one thing at a time. Maybe you have a very complex state machine that's incredibly difficult to coordinate across nineteen callback handlers. If you put them all in the same alt, you know absolutely that you're not doing more than one thing at a time, and it's super clean to write. So your code looks like this.

I would like to try to contrast the two things here, because you'll see talks about Rx and whatever, and there's this
like, talk about duals, and it's all like, ooh, duals are the same, they look the same. Duals are not the same. Dual means opposite: it has the same shape and the opposite meaning. The same transformations work on both, but the semantics are opposite. So what happens when we contrast direct calling, which is chains of function calls in the callback model, with an indirect system that puts channels in the middle? You'll see everything is opposite.

Your logic in the first case is split up into separate handlers; your logic is together when you use channels. Your calls are synchronous unless you put in some extra stuff: I'm going to call you, you're going to call them, they're going to call someone else, and boom, that's all going to happen. You don't have any real ability to spread that out unless you superimpose something extra. Whereas with channels it's inherently async. You can choose a policy that synchronizes or you can choose a policy that doesn't. But functions call functions that call functions; you can't just magically snip that in the middle. You have a one-to-one relationship between the providers and the callers. You can make broadcasters, but it's still, I'm calling whoever's going to get called, so I'm in charge of doing that. With channels you can easily get multiple producers and multiple consumers. You have this implicit relationship between a callback handler and the thing it ends up calling. You can put in all the programming indirection you want, and encapsulate it in an object or whatever, but the bottom line is that something is going to call you. And here you have an explicit separation of concerns, which also means that you can do explicit orchestration: I have somebody who's interested in consuming something, I have somebody who's producing something, I have channels, they're all independent, and I can make a third party in charge of doing all that work. Whereas with callbacks that is
very difficult to do because you have to get inside the encapsulation of things the shared state as I talked about before is an internal thing and whatever shared state there is because there's always some state associated with the channel or a queue right who can get at the head right now and that kind of thing is external in any case it's reified outside right that shared state you've got to come up with your own strategy for making sure you know your different handlers don't trample on each other the other thing that's interesting is that I think the state that you get with callback handlers is inherently a place state right so one handler is gonna say there's a new user let me put them here and another handler is gonna say let me go look there and see what was put there by those other handlers so there's this inherently place-oriented notion to that the analogy I would make is you go to work at your factory right and you have your jacket right place is like I put my coat on this coat hook and what's your expectation you can go back later and find your coat on that coat hook unless somebody else said well we're out of coat hooks I'm gonna take your coat off and put mine on it you get these collisions whereas with channels I get something that's a subset of state right yeah things are changing it's obvious right this is a moving conveyor belt there's some state here but it's flow state right if you came into your factory and you took your coat off and you put it on the end of a conveyor belt what would your expectation be you're never going to see that coat again right you don't build programs with flow state that expect to go and revisit state and therefore they're a lot less complex right there's still state there's still things in motion here they're still machines but flow machines are less complex than place machines so I think that's a big win the other thing is when you do callback handlers that shared state is your problem right making the channels
do the right thing is a library problem right it's just the channel author's problem to make the flow state work it's not your problem the logic in a callback handler is passive right when do you get called back whenever you get called back you're not in charge right you're passive when is your logic run whenever maybe I had a conversation with the guy who's gonna be calling me but maybe not when does your code run in a program that consumes channels whenever you want because you don't have to read those channels you could be doing something else you could say when I'm in this state I don't look at those channels therefore I don't hear from them right how many people have built you know large architectures of callbacks and then been like well I wish I could turn off these three when this is happening that's hard right it's very hard so you get that you have the choice right in your logic the other thing is that this implicit communication is code driven right and the explicit communication is data driven the thing that's flowing over these channels is data which means it's straightforward to go and for instance put it on a wire or do something to get a real true separation of concerns we saw this in the design of Pedestal which was a piece of logic for the browser just a library for Clojure that in its original incarnation basically takes inputs and transforms a data model that can detect deltas so you can efficiently determine when this change came in these three parts of this data model changed and therefore these parts of the UI should change and because that system was architected with queues on both ends of that thing they were able to say you know what it would be nice if we could run all this transformation logic in a web worker and they just took that code and they put it in a web worker they took these two channels and they marshaled right when you have webs of calls you can't do that kind of work because your fundamental communication is not
data it's calling and you can't just take you know call chains and split them across web workers right you can't even call across web workers so as soon as you can get to data you should you have a lot more flexibility in your system when you do that so I would say that there's a sense in which this callback thing is intimacy right everybody knows everybody you're really building this whole intimate system with a lot of connectedness and there's a sense in which a channel driven system is ignorance right I don't know I don't want to know I put stuff there and I'm done I take stuff from there I don't care where it came from right and we all know that ignorance is bliss and in this case intimacy is pain not necessarily generally but certainly in this case I think it is this is just another taste of what it looks like this is an example from Go's examples of how you would for instance set off a bunch of queries that try to reach multiple possible sources for each of an image query a web query and a video query and returns whichever one of those came back first with an answer for each of those types but bounds the entire thing with you know an 80 millisecond timeout and that's what it looks like here it's just like the Go code it's just as expressive except this is all expressions and not statements and this is a really powerful and simple way to think about your programs if you're writing concurrent programs because the semantics are very straightforward and they're semantics you can get your head around and make decisions based around it's not like this nebulous set of conventions that you're forced into with other solutions so what do you get when you do this you get a separation of concerns a for-real separation of concerns you end up with logic that's quite coherent and linear it's co-located right you end up with logic that if it has state it might be able to just use recursion to maintain that state and not need any mutation
constructs or any kind of coordination constructs versus the shared state which would require place oriented state you can get coordination out of it if you want you can run unbuffered channels and use them as synchrony points and rendezvous you can get back pressure because you're gonna put in a fixed buffer which means you can get to a point get the back pressure and then cascade that so you can build very large systems that have reliable and easy to reason about back pressure characteristics you can make them dynamically configurable again because the channels are first-class you can assemble a network that makes sense given the topology you're encountering at runtime and they're efficient so I'd like to just thank the people that helped work on it especially Timothy Baldridge who did all the icky part of the macro that inverts the control which is quite gross and if you want to try it it's here so the code is here and whatnot there's a bunch of other things in there now there are nice constructs for doing merging and mixing and pub/sub and kind of higher-level things I would hope people would not need to work at the bottom in most cases and I'd also encourage you to make sure that you reserve this code for true conveyance scenarios and not just to write goofy parallelism stuff because it's not actually well-suited for that at all but there are a lot of higher-level constructs and we hope to have more of them including Pedestal based around this kind of work and so that's all I have to say and I can take some questions probably so the question is how would you extend this to distributed systems with real queues and the answer is like I said earlier in the talk I think that's still somewhat of an open question you can't necessarily get all the semantics that I just described in a distributed queue because some of the failure modes are different on the other hand what most of the people who have tried doing it
have done is just subset the semantics so you still have these two semantics and they still work the same way and I think that's a reasonable approach to take so for instance you might have constraints around whether or not buffers could be effectively blocking you might always have to install a policy for instance like the sliding window or the dropping buffer sometimes some of the solutions like the Java CSP solution have some networking constructs that require for instance the consuming end of a channel to host it so in that case it wouldn't be as first-class right it wouldn't be a channel like a queue system that's sort of independent of any process that runs you would have the endpoint connected I don't love that because I think it starts to smell like actors at that point and you lose that sort of first-classness the channel is what it is people come and participate but it's something we're actively looking at right now I do know that I don't think all the semantics can be conveyed I mean I think that so the question is have we contrasted between CSP and pi calculus which is more recent work and more involved and the answer is definitely not yet I mean I'm not sure that pi calculus has moved to the point where I would consider it sort of closer to something I would use in actual programs yet as opposed to more of a theoretical underpinning there's plenty of great ideas there but again you know you have this challenge right are you gonna write a new language that works that way or what can you bring to an existing language so this is particularly interesting because it's a library Go probably had that question more readily available for them you know you're writing a new language why didn't you use pi calculus so my excuse is of course it's a library but I do think that there are interesting things there and we should look at them so the question is does using channels across data centers increase the
complexity from a versioning perspective between producers and consumers and I would say probably not it probably does the opposite because it's easier to agree on a data representation and migrate the code than it is to agree on data and code or code and calling signatures and data I mean it's always going to be data end to end and so what this does is takes it just down to data there the contract is a data contract so I think it's more tolerant of versioning independently on both ends because it's more independent so the question was how does this timeout policy work and so a call to timeout creates a channel like any other that you will attempt to read from and after the timeout has occurred the channel will close which will cause your read to complete a read on a closed channel returns immediately so down at the very bottom is the code that actually tries to read it says alt try to read what's happening is all of these jobs are sent off asynchronously and told to put their results on the same channel c so this code down here tries to read any of those results or the timeout channel so this will return when any of those things produces a result on c at the bottom there oh you can't see my cursor so I'm wiggling it over c at the bottom I'm sorry that alt call at the very bottom says try to read either of these things c or t and it will return when either something is available on channel c or t closes because that's the only thing that's going to happen on a timeout channel so what's cool about that is that's in the middle of a loop that loop just keeps going and going and the single timeout is governing the operation of the entire loop as opposed to having to come up with a timeout per invocation of read for instance so I think this stuff is extremely cool everywhere systems that have done this kind of work have touched this in order to find an alternative the code has become dramatically simpler
really dramatically simpler and the word dramatic should be reserved for this kind of thing it's dramatic so I definitely believe in it there's all kinds of things that you can do to try to improve performance and things like that but as an architectural construct I think it's quite appealing so I think with that well we have one more question go for it okay the question is has any work been done to get them working across processes a little bit like the other question and yeah people are working on it I'm mostly concerned that they don't do something that has the same surface and different semantics so mostly I've just told people no no no no no no because that I think would be a catastrophe you don't want something that looks the same and behaves differently so like I said before I think that there will be limitations to the semantics you can convey over a wire I'm definitely interested in having that in lieu of that though there's no problem saying I'm going to continue to use my favorite queuing system and on its endpoints which have got callbacks I'll do exactly what I advocated before then you're combining semantics you're saying you're going to convey something with a you know a third-party queue across the wire and then you're going to turn that into channels for the application code so you don't necessarily have the channel behavior for instance you might not get back pressure across the wire that way but both of these guys will feel as if they're reading a channel that's got a policy on it or writing to one so you can combine the two right you can use this with all the I/O stuff you have already you can use this with any queues distributed queues you have already and just turn their API endpoints into reads or writes of channels and then use channels from there on actually making a distributed channel that has the CSP semantics may be a research problem but I don't think it's completely possible given you know TCP and other
realities you want to address if you really want it to be something used in the real world as opposed to a theory all right well thanks enjoy your lunch [Applause]
Info
Channel: Zhang Jian
Views: 1,533
Rating: 5 out of 5
Keywords: Clojure, Programming, Programming Language, Java
Id: 9HspeHGBg-Q
Length: 46min 18sec (2778 seconds)
Published: Mon Jul 29 2019