Reactive Relational Database Connectivity

Captions
We're here today to talk about Reactive Relational Database Connectivity, or R2DBC. My name is Ben Hale. My official job title is the lead of the Cloud Foundry Java Experience, and I see a bunch of people thinking, "Well, I didn't come to a Cloud Foundry talk, that's very weird." That is only my day job. However, I have been a member of the Spring core team since 2006 or so. In my position as the lead of the Cloud Foundry Java Experience, my responsibility is basically to ensure that applications written in Java and running on Cloud Foundry are the best they can possibly be. Originally my mandate was to make Cloud Foundry 2+ the best cloud to run Java, and lately it has evolved into making Cloud Foundry the best place to run Java, not just the best cloud.

In my position, working on things like the Java buildpack and the Cloud Foundry Java client, one of the things I had a first-hand perspective on was the rise of reactive programming in the Java applications that are running in the cloud, and this led me to get involved in Project Reactor. It's been a very good day for me. Obviously you saw Reactive Relational Database Connectivity in the keynote this morning. I also spearhead the company's efforts working with RSocket, which was announced at the end of it. So it's been a pretty good day for projects that I'm interested in, and especially reactive projects.

I personally believe reactive programming is the next frontier in Java for high-efficiency applications. We can sit here and say that reactive programming is a bit more difficult, certainly today, than traditional imperative Java programming. However, there are circumstances where the one-thread-per-connection model just absolutely breaks down. It cannot scale to the kinds of loads and workloads that we need in our applications, and to date the only real alternative that we have in the Java community is to go with
reactive programming.

Just to level set everybody here: how many people program reactively today, whether you're using Rx, or using Akka, or using Project Reactor or anything like that? How many people? OK, a small number. So I will try to make sure that I explain a lot of the reactive terms as I go along. If not, feel free to raise a hand and interrupt me, and I'll see if I can clarify for you as we go.

Reactive programming is fundamentally about writing non-blocking code. If you are familiar with programming in Node.js or something like it, there is one thread, or in the Java case, as Rossen mentioned in the keynote today, typically the same number of threads as you have processor cores. It just means non-blocking. We'll talk a little bit about backpressure in a second, but one of the key things is that it actually means almost nothing about asynchronous computing. It is paired with it, but reactive programming doesn't necessarily imply asynchronous behavior. Most flows, if you're a reactive programmer already today using something like Project Reactor, are 100% completely synchronous, because that is the most efficient way to actually run an application, given you have access to the core. Reactive programming talks about what happens when you need to wait on something. You don't want to block for that; you want to get off of that core and let someone else make use of it in a highly efficient way before you jump back on it.

But one of the key differentiators, and this is going to come all the way back at the very last slide, is that async is not reactive. You'll hear a lot of people tell you, "Oh yeah, my thing is async, I don't block on a thread or anything like that," but that's not what reactive is. The key differentiator is that reactive is all about what we call Reactive Streams backpressure, reactive backpressure. You'll often see it written as pull-push backpressure, and the key kernel of this idea is
that when a publisher needs to send data to a subscriber, basically a producer to a consumer, it is not allowed to send data, and therefore it doesn't even need to materialize the data that it might potentially send, until the consumer is ready and has asked for it. So in the Reactive Streams way, the consumer basically sends a request of n: I want you to send me the next 8 items, I want you to send me the next 32 items, I want you to send me the next 1,024 items. Then the producer is able to produce exactly what it needs at that time to fulfill that request. It doesn't have to send all of them at once. It could send ten, wait a couple of minutes, send another ten, and then wait another couple of minutes. Or if it's got, you know, ten thousand, it only gets to send the first thousand, and eventually the client, when it has processed them and feels it has the availability, will go and request an additional ten, or whatever additional number of items it can handle.

We've been talking about this at SpringOne for the last three years. Three years ago we got on stage and started talking about this reactive thing in Project Reactor: we've got Java 8, almost everybody's on it now, we've got lambdas, which make a nice programming model around this, we can start doing reactive. Then last year Spring 5.0 came out and it's all reactive. Not a replacement for the imperative stuff, but we've got WebFlux, we've got WebClient, we've got Reactor Core becoming a core dependency, only the second mandatory core dependency ever in Spring Framework's history. I've been around basically since the beginning, and it's an amazing thing that such a thing can happen, but it tells you how important we think reactive programming is going to be going forward.

If we take a look, this is just your sort of standard WebClient thing. We've got an annotated controller. It returns a reactive type, a sort of lazily initialized health, and it does this by doing a
composite, zipping together calls to two other services so it can represent itself as a single authority there. That's all well and good. Hopefully some of you have started experimenting with it, even if you're not doing it in your day job.

But there are still some barriers to using reactive everywhere. I put a little star there because there are certain true believers, you might call us, who really do think that eventually it's all going to be reactive. You don't have to think like that, and you shouldn't think like that. That is not an official position of anybody on the Spring team; that is my own personal opinion. When we say reactive everywhere, what we really mean is everywhere that it's appropriate. So today there are these high-efficiency use cases that we were talking about earlier. Those are really obvious places where you want to use reactive programming, but that absolutely does not mean you can use reactive programming today.

Cross-process backpressure is one of the things we don't have, except that as of about 30 minutes ago we do have it: RSocket. I encourage you to take a look at the talk that's happening right now, if you want to run over to the RSocket one, and I'll also be doing a talk later this afternoon about RSocket as well. And we also don't have very good data access. Going back to data stores, there are certainly a number of reactive data stores, especially NoSQL ones, things like Cassandra and Redis, as was just shown in the previous session in here. But to date we haven't really had access to the number one data store, by a far margin, that all of you are using: relational databases. This has been an absolute stop. If you have an application that would immediately benefit from using reactive programming, you're hosed. You don't get to do it if you're trying to connect to a database via JDBC; there's no way to get the good stuff there.

And so, R2DBC. R2DBC came about as, you know, we did year one of
"here's Project Reactor", we did year two, which was "here's all the stuff in Spring", and as we started working towards year three and what our reactive story is, we started saying: OK, we've taken down the low-hanging fruit, but there are still these places that stand out, that aren't good enough. So we had some meetings. Traditionally at the end of SpringOne there's always a big engineering team meeting. The Spring team is massively distributed, so we don't really get to see each other in person very much, but this is the one time where we all get in a room and talk about what we're thinking about for the next year. The letters R2DBC were said, the idea was put out there, and nobody really wanted to do anything with it. We had engaged with the JDBC spec team around ADBA, which we'll talk about a little bit later, and we thought we might have some traction there, and some time went on.

So when I said a little bit earlier that I can't believe this room is this full, it's because we didn't think this was actually that pressing a problem; we thought people were using NoSQL or dealing with this in some other way. But then December rolls around, work slows down a little bit, I'm stuck at my in-laws' house and didn't really want to be talking to people, and over a couple of weeks I sat down and said, "Let's do this. Let's see what R2DBC is," just in my spare time. I am absolutely astounded, amazed and appreciative that you have all come here to hear about it.

R2DBC comes from this idea: we sat down and said JDBC can't work. There's absolutely nothing you can do to make a blocking API truly reactive. That's just first principles. So if we were freed from that, if we were freed from the restriction of needing some kind of backwards compatibility, and we are the experts in reactive programming in Java, what would
we design? There are really two components to this. One is taking a look at what a reactive API looks like, and for that you need some experts in reactive programming. The other is what database access, what relational access, should look like in Java, and for this you need people with decades of experience dealing with JDBC and with enterprise customers.

So we came up with a set of design principles. First and foremost, we're going to utilize Reactive Streams types and patterns, and we're going to see this all throughout the API today. Second, it must be completely non-blocking, all the way to the database. This is a key thing, and one of the places where a number of other attempts at this kind of thing have broken down. Simply wrapping a JDBC driver in a thread pool is insufficient; we'll talk a little bit later about why that is. Third, and this gets to that second audience we were talking about, we want to shrink the driver SPI to the minimal set of operations that are implementation specific, regardless of their usability. We'll talk on the next slide about exactly why this is key. And finally, we want to say that there should not be an official client, in the way that in JDBC there is sort of an official client for how you deal with JDBC, which has turned out to be so bad that everyone has chosen to build a million other clients on top of it. We want this separation, with the driver SPI being very, very small and tailored directly to driver implementers, while enabling all the functionality required for you all to write exactly the client you need.

So, the driver SPI. I'd say one of the biggest failings was that we had the same API that driver implementers had to implement to enable these rich clients to sit on top of it, like JPA or jOOQ or something like that, but it also had to be the one that, if you chose not to use us, if you just wanted to stay in sort of plain vanilla, you as a
user also needed to use. We had this conflation between things that were good for machines and things that were good for humans, and no one liked it. I have spent some time recently with JDBC driver vendors, and they are super not amused with the JDBC API. You've got to do it, but it's not the easiest thing to do, whether it's implementing a bunch of methods that they don't really need because some database at some point needed them or some client thought it was a good idea to get them in there, or duplicating functionality that all driver vendors have to implement but cannot reliably share with one another. The canonical example of this is a humane affordance like the question mark that we've all used in JDBC queries. It has to be implemented by everybody, even though the way it works, turning this thing into an index into an SQL query, works in exactly the same way in all of these databases. Everybody gets to implement that from scratch, with just slightly different semantics and just slightly different bugs, and you end up with a thing that's really, really painful for driver implementers to do.

I'm going to go through a couple of the major types. The driver SPI itself is relatively small. I think, including enums, interfaces and types, it's about a dozen classes or so. I'm not going to go through all of them; I'm going to hit the big ones.

There's ConnectionFactory, which hands you back a Connection using a Mono to return it, so it's lazily initialized. You don't actually ever even call for a connection until you need the connection to actually send data. So you could potentially write a class that calls for a connection on startup, and that connection is never actually provisioned for a couple of hours, maybe.

But the Connection is where the rubber meets the road, the core of what makes R2DBC. You're going to look at this, and at a lot of these methods, actually. I am a stickler for verbosity. I like things to be very clear and very well named. I'm not
paying by the character or anything, and we all have autocomplete in our IDEs at this point, so I like relatively long names. But it speaks to the design principle that we need it to implement the smallest possible thing that is database specific: exactly how transactions are created, exactly how they're committed, what it means when you're trying to do a batch, like a batch insert or something like that, how savepoints and transaction isolation work. A driver implements this core of, what do you think that is, about a dozen methods, and that turns out to be the core of what a JDBC driver does uniquely from all other JDBC drivers.

I'm going to skip what Batch does, because it's super simple and looks relatively like this Statement thing. For those of you who are familiar with JDBC, there is a Statement, and then there's a CallableStatement and a PreparedStatement, and it turns out that the semantic differences between them aren't really reflected by databases any longer. They have this artifice, but they all basically channel back to the exact same call, which looks like a prepared statement, effectively, in their underlying wire protocols. So we collapse all of that down and say: here's a Statement. When you create one of these things, you're typically going to put some sort of SQL query in, and you want to have the ability to bind a bunch of things to it. You can bind by an object identifier. So if you're on Postgres, which is the one we're going to talk about the most today, identifiers natively in its SQL variant are dollar-number, so $1, $2, and you might put that in there. Or if you're going to Oracle and they use named parameters, maybe it's :foo, and then whatever the value is. There's binding to a numerical index, and binding null, where you always have to have a class type, so those get their own methods. But then there's this add method that sits at the top, and this is the concept
that all statements should be prepared statements. Realistically, once you have one of these things and it's negotiated with the database, you want to be able to bind a bunch of parameters and say this is going to be my insert statement, add that into a queue, bind a bunch more, add it to the queue, bind a bunch more, add it to the queue, and then finally execute. There's executing, and there's executing where inserts, for example, auto-generate keys that need to be returned.

That execution returns a Result, and a Result is relatively simple. It'll give you, again lazily initialized, the number of rows that were updated, or it will effectively hand you an iterator. We would refer to this as a mapping function. I'm using a BiFunction here, where you get a row and you get metadata about the row. So you get this thing that has all the data in it, and this thing that tells you about the types it was stored as inside the database, and you are responsible for using that to convert it into Ts, whatever objects you want them to be. Maybe you just return the row because you want to use it a little bit later, or maybe you're the kind of person who wants to do manual object mapping in here. These are the primary things people want to do: get my data. The function can be just an identity function that returns the initial thing, or it can do some sort of limited transformation.

You have a question? So, the execute returning generated keys will hand you a Result. This may be the kind of distinction that doesn't actually matter once we start implementing other things, but certainly right now we think that way. You could imagine an insert statement with an execute that didn't return anything would just end up with an empty result set, versus an insert statement where you're intending to get a bunch of generated keys back, which does actually return a result set with those generated keys in it. Yeah.

And then finally, there's no header for it, but the get
at the bottom is basically: here is my row, I'm going to take an identifier, go get me column foo from this row, and I want you to convert it to this particular type. Database vendors, in all of the JDBC implementations, have a canonical list of what they think their SQL data types map to in the Java world.

So let's take a look at what some of this code actually looks like. There are some example projects; at the end we'll take a look at the GitHub repo and I can show you where some of this stuff is. You have a ConnectionFactory create, which will give you a Mono of a Connection, as I said, and then we typically do something like a flatMapMany. We get access to the connection, we create a statement and we execute that statement. The thing that is returned, that Result, we then map in some way, and we say: here's my row and metadata, I want you to go get me the value. So the net result of this invocation is that only at the end, when somebody subscribes, does it go and grab a connection and execute the query, and for each one of those rows I'm going to return whatever the value is. Let's call it an integer; it's just 1, 2, 3, 4 or 5. This results in a Flux of integers as the end result, with a life cycle around the connection: it's only opened on subscribe and closed off once it's complete.

This will return a Flux right here because of the flat map, effectively, at the end, which turns a single Mono result into a bunch of rows, or a bunch of things mapped out from the rows behind it. Yeah, absolutely, it's a good distinction to ask about. There are places in here where you might get a Mono. If, instead of asking for the result's map, we'd gone for getRowsUpdated, you would typically get a Mono of that integer back, because it's only one thing.

One of the things that this does point out: you'll notice the spec is written in terms of this thing called Publisher, and yet I'm talking about Fluxes and Monos
with the audience right here. The spec itself is written to Reactive Streams. Reactive Streams is sort of a neutral third party that defines publishers and subscribers and things like that. It's a specification that allows interop between multiple different Reactive Streams implementations. The reason it's critical to use it when defining a spec like this is that there is an equivalent from Java 9 on called Flow, and Flow contains the exact same types, with the exact same semantics and the exact same definitions, as exist in Reactive Streams. As for the implementations that we do, well, we're a Project Reactor house, obviously, so we have Flux and Mono and we return those directly, which is convenient if you are also a Project Reactor kind of place. But if you're not, the fact that I have returned a Mono doesn't matter. It's a Publisher, which means you can translate it to RxJava or something like that.

Maybe we want to do an insert, and this is a demo of that binding that we talked about a little bit earlier. We've got our connection, created our statement, and we're binding the two values and doing an add. There is an implicit add when you do an execute at the end, so you can do it manually if you want to, but it effectively results in inserting two rows in this particular application.

Now we start taking a look at the more complex stuff: a prepared insert inside of a transaction. Now it starts getting clunky, and I'm confident enough to say this is not great. There's this idea that Connection beginTransaction returns a Mono of Void; basically, it signals that yes, the transaction has been opened. Then you have to thenMany with this thing, which was the thing that existed outside the transaction. Then we have to commit the transaction, and if something goes wrong, we need to roll back the transaction and propagate the error, and stuff like that. This is where we start to see the distinction: this API, while super concise and focused on implementers, isn't super humane. Because clearly you want to do this, but it's a bit
verbose, and especially that bit at the bottom is going to be prone to errors. This set here down at the bottom, the delayUntil and onErrorResume, is effectively try-catch-finally-try-catch. I suspect there are some JDBC developers in here who are young enough that they don't remember the bad old days when every single bug was because you didn't do try-catch-finally-try-catch properly with JDBC. Thank God we've got things like AutoCloseable and JPA to take that off our hands these days.

So what we actually need is a humane client that gets built on top of this. But more importantly, we don't need just one. I can't decide for you; certainly as a spec lead, a prospective spec lead, on this, I can't decide what everybody's going to need. I don't have the same programming style you do. Clearly I like really long method names, and I am sure a lot of you hate really long method names. So what we want is a thousand clients to bloom around this kind of thing.

I've got some ideas. Take the very first example we saw: r2dbc withHandle, handle select. This is a slightly different variant, because it doesn't give you row metadata and things like that. It's somewhat simpler. I don't ever deal with running execute, I don't ever deal with getting the row mapper back from the result or anything like that. It's all hidden away. You can get to it if you want to; that's the whole point of having this lower-level SPI, especially if you're a client implementer. This is implemented in terms of the driver SPI, but it says that most people probably want to run a select statement and map those rows out, and it will take care of everything else that goes on inside of it.

What about that insert that we saw a little bit earlier? createUpdate, bind, bind, add, bind, bind, execute. You still get those same kinds of behaviors there, because binding takes multiple things, so we obviously need to expose execute in this case. One
of the keys is that this execute doesn't give you access to the Result. It assumes that most inserts just want the number of rows updated. I think in reality most of us don't even care about that number; you just want to insert these things and, yeah, it's good to go.

The final one: what about that transaction, with all that delayUntil and onErrorResume behavior at the end? What if it just looked like this? What if, because of how these closures work, we could bake that same boilerplate into a client? Now we just say, "I want this to be in a transaction." There's even a field before the closure, in an overload of this, that will allow it to take the transaction isolation level that you want to use: I want to execute these always serialized, or something like that. Now you start saying this is actually interesting. This is a client that is humane, and I haven't subjected every single driver implementer to having to implement that exact same boilerplate about how to wrap this thing.

But as we said a little bit earlier, and especially if you were in the talk beforehand, this is JdbcTemplate-level stuff, and for most people these days that is not your speed. Everybody wants something like Spring Data repositories. So what if there was a client that looked like that? findByLastName: it's got some query assigned to it and returns a Flux of customers, so that you can just call it and you get a Flux of these things back. This is an example of yet another client that can be built on top of that same driver SPI. We have this today, obviously. It's in Lovelace, and in Boot 2.1 as well.

So what about alternatives? We talked a little bit earlier about JDBC in a thread pool. If you tried to implement this with JDBC inside a thread pool, there are two problems. One, there is no such thing as backpressure. Every single one of those behaviors we saw, every single API that you might call, has this
underlying implication that when I am a client and I call that database and say, "Give me a hundred million rows," the database has no way of ever sending a hundred million rows to me, blowing up my heap and taking down my VM. What happens is I ask for eight, and then it is responsible for holding off and only sending the eight when I'm ready for them. Maybe you want bigger numbers there, to configure these things; maybe you can handle 64 in a row, or 128 in a row, or something like that. But one of the really interesting side effects of this is that it actually helps databases themselves as well, because a well-written driver like this will make it so the database never even materializes those rows. They're literally never read off the hard drive in the first place. So rather than having a bunch of people all asking for a hundred million rows simultaneously, and then having to materialize and transmit across TCP, say, five times 100 million rows, what ends up happening is everybody gets to make that exact same query, and now it only reads enough from the hard drive to send five times eight rows. That's this underlying implication, and you could never get any of that if you were just using a JDBC driver.

But beyond that, there's the queue. With this pattern where people think, "Oh, reactive means I just need to be non-blocking, so I'm going to wrap something in a thread pool," what ends up happening is that the queue of work you're trying to get into that thread pool is either unbounded, at which point you end up with requests that explode and basically take out your heap because you've stored a hundred million pending requests, or you choose a bounded queue, which is a hell of a lot safer, but that's actually going to lead to blocking. It turns the whole system back to blocking, because when you try to insert that 64th, 128th or 100 millionth request and the queue hasn't yet been serviced, it will eventually block the thread that you made the call on. And as we said
earlier, when you're doing reactive programming it's not like you've got 200 threads to play with. You've got 4 threads to play with, or 8 threads to play with, so blocking one of those is a really, really bad idea.

ADBA, the elephant in the room, or the elephant emoji in the room; I think those are a little bit smaller. As Ollie mentioned today, we engaged pretty early with the ADBA crew on this, but there's a lot of contention over whether or not CompletableFuture is actually reactive. I'm not here to crap on ADBA, and I'm happy to talk about the technical issues later. Suffice it to say that it's not reactive: it can't do backpressure, it doesn't do those kinds of things. We sort of disengaged, and that led to the R2DBC effort. But now that the R2DBC effort has actual proof and working APIs, we're being invited back to engage again, so ADBA may yet become this, and I think to some extent that is the end goal of a project like this.

So, safe harbor statement. This basically means I'm about to tell you something that doesn't exist. R2DBC was originally envisioned as an API playground. As I said at the beginning, we effectively said, "If we were not constrained, what would we design?" and more or less that means, "What would I design, personally?" There has been some input from the Spring team after the initial version was made, but it's still up for discussion. It's so great to see so many people in here so interested in this, because it gives you an opportunity to have a say in a spec like this, and I dare say more of a say than you would have if you were trying to influence the JDBC spec.

The code itself, I think, is absolutely great for you to start testing against. Please, for the love of everything, do not take this into production. OK? Not production. But it is stable enough for you to start writing applications and testing on something like this. There are a lot of edge cases: for example, no BLOB and CLOB support, and only Postgres support. We
actually talked to Steve Gury, the Facebook guy that was on stage, who offhandedly mentioned last night around 2:00 in the morning, when we were all still working on our slides, "Oh yeah, I've totally implemented a non-blocking asynchronous MariaDB driver. Hasn't everybody?" So we actually have an avenue to getting a MariaDB implementation of this quite quickly. We probably want at least one other client; who knows what it's going to be. I have already been talking to someone over Twitter whom I have invited to contribute a driver for their database.

The big takeaway for me from doing this kind of work: it turns out writing database drivers is not that hard. I've been doing Java programming since the late 90s, and I had always assumed that to do a JDBC driver you had to be the smartest developer in the world. Most of them are actually not great inside anyway, and they have a lot of legacy cruft in them, but it's actually not that hard. So if you have a database that isn't likely to be tops on the list, you can do it, and I'm happy to help. I'm on Twitter, and on the GitHub repos we're going to see in a little bit. I am happy to help you make a contribution to this project for a database that matters.

Finally, Spring doesn't generate specs. We aren't spec leads and we don't host specifications. We're happy to be a de facto specification in the case of Spring Framework, but realistically this project's entire goal is to become a way to influence the ADBA spec. That is the best possible scenario. But make no mistake, I am not a person who's going to tolerate the ADBA spec being bad. If they don't take our advice, if they don't see that reactive is different from being async, this is something that the Spring team will do. There's clearly enough demand in this room. People want to do reactive relational data. They need it to be reactive; being asynchronous is not sufficient. We will make this a project, and we will implement
drivers, and we will work with database vendors to guarantee that you, the community, have this kind of implementation.

So finally, r2dbc.io. Let me pull up a quick web page here. The easiest way to get there is r2dbc.io, written very, very tiny right here. It takes you to the GitHub org. There are three main things you're going to see there, along with a couple of other things around CI and stuff like that. There's the R2DBC SPI, which, as you'll find out, is very small: a bunch of interfaces, basically, and lots of documentation, written against the Reactive Streams specification with an eye towards eventually being written against the Flow specification. There's the R2DBC client, which could basically have any underlying strata underneath it. You build some sort of client, and this is one potential client that looks a lot like JdbcTemplate; we want to see a lot more. And there's a Postgres implementation. I would love for this list to start growing, with other databases underneath, and I would love for all of you to come and contribute and say, "Hey, here's a use case that I've got that your API, your driver API or your client API, doesn't actually support today. We want to make sure that it can do this thing that I absolutely need."

And with that, let's call it a day. Thank you all very much for coming. I'm happy to take questions up here. [Applause]
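The request(n) backpressure idea described in the talk can be sketched with the JDK's own `java.util.concurrent.Flow` interfaces, which the speaker notes carry the same types and semantics as Reactive Streams. This is only a minimal, single-threaded illustration and not R2DBC, Project Reactor, or driver code: `RangePublisher`, `BatchingSubscriber`, and the `maxBatches` cap are hypothetical names invented for this example, and a real publisher would also have to be thread-safe and support more than one subscriber.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

public class BackpressureSketch {

    /** Hypothetical publisher: lazily produces 1..total, one item per unit of demand. */
    static final class RangePublisher implements Flow.Publisher<Integer> {
        final int total;
        int produced = 0; // items actually materialized so far

        RangePublisher(int total) {
            this.total = total;
        }

        @Override
        public void subscribe(Flow.Subscriber<? super Integer> subscriber) {
            subscriber.onSubscribe(new Flow.Subscription() {
                private long demand = 0;
                private boolean emitting = false;
                private boolean done = false;

                @Override
                public void request(long n) {
                    if (done) {
                        return;
                    }
                    if (n <= 0) {
                        done = true;
                        subscriber.onError(new IllegalArgumentException("request must be positive"));
                        return;
                    }
                    demand += n;
                    if (emitting) {
                        return; // re-entrant request from onNext; the loop below sees the new demand
                    }
                    emitting = true;
                    // Nothing is produced unless the subscriber has asked for it.
                    while (demand > 0 && produced < total && !done) {
                        demand--;
                        subscriber.onNext(++produced);
                    }
                    emitting = false;
                    if (produced == total && !done) {
                        done = true;
                        subscriber.onComplete();
                    }
                }

                @Override
                public void cancel() {
                    done = true;
                }
            });
        }
    }

    /** Hypothetical subscriber: asks for batchSize items at a time, for at most maxBatches batches. */
    static final class BatchingSubscriber implements Flow.Subscriber<Integer> {
        final List<Integer> received = new ArrayList<>();
        boolean completed = false;
        private final int batchSize;
        private final int maxBatches;
        private Flow.Subscription subscription;
        private int batchesRequested = 0;
        private int remainingInBatch = 0;

        BatchingSubscriber(int batchSize, int maxBatches) {
            this.batchSize = batchSize;
            this.maxBatches = maxBatches;
        }

        @Override
        public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            requestNextBatch();
        }

        @Override
        public void onNext(Integer item) {
            received.add(item);
            if (--remainingInBatch == 0) {
                requestNextBatch(); // only ask for more once the current batch is processed
            }
        }

        @Override
        public void onError(Throwable t) {
            throw new IllegalStateException(t);
        }

        @Override
        public void onComplete() {
            completed = true;
        }

        private void requestNextBatch() {
            if (batchesRequested < maxBatches) {
                batchesRequested++;
                remainingInBatch = batchSize;
                subscription.request(batchSize);
            }
        }
    }

    public static void main(String[] args) {
        RangePublisher rows = new RangePublisher(100);
        BatchingSubscriber client = new BatchingSubscriber(8, 2);
        rows.subscribe(client);
        System.out.println("received=" + client.received.size()
                + " materialized=" + rows.produced + " of " + rows.total);
    }
}
```

With a batch size of 8 and a cap of two batches, the subscriber ends up with 16 items and the publisher has materialized only 16 of its 100. That is the same effect the talk describes for a database that never reads unrequested rows off disk.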
Info
Channel: SpringDeveloper
Views: 9,439
Rating: 5 out of 5
Keywords: Web Development (Interest), spring, pivotal, Web Application (Industry) Web Application Framework (Software Genre), Java (Programming Language), Spring Framework, Software Developer (Project Role), Java (Software), Weblogic, IBM WebSphere Application Server (Software), IBM WebSphere (Software), WildFly (Software), JBoss (Venture Funded Company), cloud foundry, spring boot, spring cloud, r2dbc, reactive programming
Id: idApf9DMdfk
Length: 33min 8sec (1988 seconds)
Published: Wed Oct 03 2018