Rich Hickey: Deconstructing the Database

Reddit Comments
Submitted by u/sideEffffECt · 7 points · Aug 29 2012

I would like to see a blog engine or wiki built on this database, and compare the lines of code to a solution in SQL.

— u/seunosewa · 5 points · Aug 29 2012

Awesome. I'm relatively new to Clojure/FP, but I really like how he puts the same ideas onto a database model.

— u/naeg0 · 3 points · Aug 29 2012

I really like the underlying ideas. Do any of you know any alternatives to this? I mean a database that stores triples, has explicit support for versioning and a transaction semantic?

— u/occio · 2 points · Aug 29 2012
  1. With nothing ever being deleted, how would his system handle something like an update statement that accidentally runs over and over again? For instance: while (some_always_met_condition) update myFact ...
  2. What would be the equivalent of SELECT ItemID, Quantity FROM OrderLines WHERE OrderID = 12345;?
— u/jcriddle4 · 2 points · Aug 29 2012

So, noob question: how can I start experimenting with this new structure? Is there, or will there be, DB software built like this, or can it also be done on top of existing ones? (I think he mentioned that briefly, but I'm not sure.)

— u/[deleted] · 2 points · Aug 29 2012

Rich Hickey... this man is a friggin' genius!

He actually extended his old identity/value/perception concept to databases. I also admire his choice of what NOT to implement/reinvent: he chose to reuse existing storage solutions, because he noticed that the existing ones are actually quite good.

— u/zarazek · 2 points · Aug 29 2012
Captions
This talk is called Deconstructing the Database, and it's a look at a new way to think about database architecture. It also happens to be a look at the underpinnings of the architecture of Datomic, which is a database I've been working on for the last two years, but it's not a sales pitch for Datomic — it's really about the ideas underlying the design choices.

So why do we want to deconstruct the database? What problem are we trying to solve? I think the fundamental problem is the problem of complexity in programming. How many people think dealing with databases is easy and trouble-free? Most people don't, and there are a number of sources of complexity. There's this great paper, Out of the Tar Pit, in which the authors identified a bunch of problems related to complexity in programming, and they basically said that all complexity boils down to two flavors: one has to do with state and the other with control. The authors didn't implement it, but they suggested that by adopting functional programming, declarative programming, and a relational model for data inside our applications, we could get rid of this complexity. It's a great paper and I really recommend you read it, but one of its problems is that while the authors had a good grip on the functional and declarative programming parts, and possibly on using a relational model for data inside your applications, they really punted on something. Has everybody seen the cartoon where the mathematician has a chalkboard full of stuff, and in the bottom corner it says "then a miracle occurs," and then there's the answer? The big thing missing from their picture was this: they imagined there would be a relational model of your data that you could access in your application, and that somehow it got updated — something happened in the world and the model was different — and all the ick related to state had to do with however that got updated, but they didn't say how it would. The problem they avoided talking about is really the problem of process in our programs. We know there's going to be novelty in the world that our programs will encounter. Where does that go in a model that's otherwise functional, or as functional as we can make it?

Some of the other things we want to obtain in looking at the database in a fresh way: I think we should embrace declarative programming — I agree with the paper there. What's the best example of declarative programming we encounter most often, if we're not artificial intelligence researchers? SQL, actually. It's the most declarative thing most of us touch on a day-to-day basis. And it ends up that declarative programming is much better at manipulating data than what we do in our languages, even functional languages. In our languages we walk through stuff; we don't have that very nice higher-order set logic for dealing with data. Declarative programming is superior for dealing with data even compared to a functional language, and way superior compared to an object-oriented language. We use object-oriented languages because we have them, not because they're better at this — they're really much worse. But the problem we have with a client-server database is that declarative programming is something alien to us; it's sort of over there.

The other problem with the model those authors were espousing is that there's no basis you can rely on. If I want to calculate something related to a database, what's the basis for that calculation? If the whole database is changing all the time, we're back to the problems I talked about in the keynote: the database in its entirety is a place, and we have a problem saying what the basis for our decision-making was. "I don't know — it was whatever I saw last Tuesday when I ran this computation, but I can't tell you now what that was exactly."

There are also problems with databases that stem simply from their client-server nature, from being over there. The basis problem is one of them; another is our fear of round trips. We're afraid of round trips, most often for performance reasons, but actually the biggest round-trip problem is that same basis problem. If I have a composite decision to make, can I ask the database three independent questions over time and then assemble my answer? No. Why not? Because stuff has happened to that place in between those calls. And that makes us do weird stuff — in particular, it makes us couple questions with reporting. Say your application decides which entities to put on sale or display on a web page. Do we send one query to find out which entities those are, and a later query to gather the data we need for display? No — a lot of times we piggyback those two things together, because we're afraid the result sets won't match up anymore. That fear is born of the lack of basis, and it's the biggest problem with round trips. Of course, from a design perspective we know those two pieces of logic should be separate, independent decisions: one part of my app knows the logic for deciding what should be displayed, and another knows what the screen looks like and what we want to show.

We have problems related to consistency and scale. I don't know if anybody saw any of the NoSQL talks this week, but we often have difficulty scaling servers that are monolithic by default. We've seen NoSQL, we've seen the Dynamo paper and some of these other technologies, and I think one of the questions in revisiting the architecture of a database is: what's possible? How much of the value proposition of databases can we retain while tapping into the new value propositions of distributed systems — in particular their arbitrary scalability and elasticity? Also, I think people are adopting these distributed systems and getting a bunch of complexity as a result, because they're trading away consistency for distribution and scale. But we have things like Dynamo and Bigtable; how do we use them?

Other problems we have with traditional databases are flexibility problems. Everybody knows the rigidity of relational databases: the big rectangles, the artifice of having to form intersection record tables — things you really shouldn't have to know about — and your application ends up becoming rigid because it does know about them. In addition, lots of things are difficult to represent in a traditional relational model, like sparse data, irregular data, or hierarchical data. So we want to be more flexible, we want to be more agile in our development, and we want to keep that rigidity from seeping in.

And of course, related to the talk earlier, another thing we want to get right if we revisit database architecture is information and time. In particular, we want a database we can use to represent information, to obtain real memory and real record-keeping like we used to have before we had computers. There are lots of good reasons for this: it supports decision-making, as I said in my earlier talk, and auditing, and there are plenty of domains in which it's a requirement — and people are doing it manually, on top of systems that don't understand that that's what they're trying to do. How many people have ever added a timestamp field to tables themselves and managed it all themselves? How many people have written the query that gets you "now" out of that table? How many people have tuned that query? Yeah — that's a nightmare. Tuning that query is brutal; the contention is terrible, especially if it's also an online system.

The last thing I think we'd like from databases — maybe we don't think about it now because we don't connect the two — is a model for perception and reaction. Perception is part of what I was talking about before: getting a stable basis for decision-making. Reaction is more like eventing: things are changing in the world. How do we see change in a traditional client-server database? What do we do? It's the four-letter word that begins with P and ends with -oll: poll. We poll. Very gross. We'd like to be able to build reactive systems that don't poll, and we'd like those systems to get consistent views of the world, which is another difficult thing. Even if you build a trigger-based eventing system manually — people have done that — the triggers say, oh, something changed. Okay, great. But between when they told you something changed and your wanting to make a decision on the basis of it, maybe you go back to the database — what's the basis now? Did it change again while your request was in flight? You have no way to say "this change was related to the database at this point in time" and go back and ask questions to figure out what caused the change, what its effects should be, or how it relates to the rest of the world. You were just told that something changed, and maybe given a value, but not where that sits relative to the rest of the world. We want to do that better.

So if we're going to take a database apart, we have to look at how it's put together. This isn't any particular database, but it should be familiar to anybody who's dealt with traditional databases and how they're laid out (sorry, this font is too small). The guts, the meat of it, is at the bottom, so we'll start there. A database is certainly something we expect to be durable, so most traditional databases are built around a disk: there's this disk, we put the disk in a box, and that box is in charge of the disk — everything comes from there.
There's some I/O subsystem that deals with the disk, and then there are two fundamental sets of services the database provides, both relying on a third thing. First, there's a transactional component that accepts novelty and integrates it into the view of the world. Then there's a query support component that accepts questions and gives you back answers based on the values in there. And both — especially the query side — need storage with leverage. If you took everything that came into a database and just appended it to a flat file, how good would the query engine be? Not very. Leverage comes from indexing, from organizing the way we store data so that a query engine has sorted views of things it can use to answer questions quickly. That's the leverage of a database. I think we've gone to key-value stores that have almost no leverage and we're still calling them databases, but in my mind this is what made a database a database. Otherwise, we had file systems and all kinds of other things before we had databases, and we didn't call them databases — so why are we calling key-value stores databases now?

Traditionally this was a big monolithic thing: a big, sophisticated — or complicated — process that knew about all of this and had an integrated view of how it worked. Because it was expensive, and the memory it needed was expensive, and the box it ran on was expensive, this was a very special thing: you had one of them. Then you had clients, which were somewhat more lightweight, and they communicated with it in a foreign language. How do you communicate with a SQL database? Strings in a foreign language. You send them over and it does something. Same with queries: you send strings over in a foreign language and you get back — well, who knows; maybe the API makes it look like a result set up at the Java level.

Then, as we get a lot of apps going, this unique resource gets taxed: everyone's putting all their data in there and everybody's asking questions there. We know reads dominate in most applications, so most applications eventually add another tier. If it's very costly for me to ask questions, I'm going to store the answers in a cache, and next time I want to ask that question I'll check the cache first; otherwise I'll pay the cost of going all the way to the server. And what goes in the cache? What form does it take? When does it get invalidated? Whose problems are all of these questions? Yours. Or maybe you buy into some fancy ORM, and that makes it your problem with another layer on top of your problem — now you have two problems. It's definitely not the server's job. I would call this caching over a database.

There are some other things a database comes with that we don't necessarily think about. Most databases have a data model. It can be a really low-level thing that's about how things are stored, an API kind of thing, or it can be a relatively high-level thing. It's certainly a great trait of SQL databases that they're based on a mathematical foundation, the relational algebra: a proper data model with a bunch of great characteristics that allow you to write those declarative programs.
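(To make that "programming with strings" point concrete before moving on: one side ships logic as a string to the remote place that owns the data, the other evaluates a query-as-data in-process against a database value. A sketch only — the JDBC call assumes an existing Statement named stmt, the Datomic peer calls are real API, but conn and the attribute names are invented for the example.)

    ;; Client-server: logic shipped as a string to the place that owns the data.
    (.executeQuery stmt "SELECT email FROM users WHERE name = 'Sally'")

    ;; Deconstructed: the query is a data structure, evaluated in-process
    ;; against an immutable database value.
    (require '[datomic.api :as d])
    (d/q '[:find ?email
           :where [?e :user/name "Sally"]
                  [?e :user/email ?email]]
         (d/db conn))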
But they also contain a state model, and in fact relational algebra is a lot like that Tar Pit paper. The relational algebra is, like, perfect: it says here is the state of the database, and all this algebra, this math, applies to it. Great. So how do you get a new state of the database? Well, a miracle occurs, and then you have a new relational world. Update is not mathematical; there isn't the same kind of model behind it. So there is a state model, and in general — not all the time — it's an update-in-place model, and it's subject to all the criticisms I gave in the keynote.

What's usually missing is an information model, and here I mean something precise, the same thing I said before. (Anybody not in the keynote? Okay, I'll just keep referring to it.) By an information model I mean the ability to store facts, to not have things replace other things in place, and to have some temporal notion attached to what's being stored. That's what I would consider a true information model, and it's usually missing from databases.

So we want to solve all of this. We want more scalability; we want to leverage these new distributed systems; we'd like more declarative programming in our applications; we'd like a proper information model; and maybe we don't want to program with strings anymore. What are the challenges if we try to do that starting from the traditional approach? The biggest one, by far, is the state model — the fact that it's update-in-place. And again, as I said in my talk, there's a great reason why traditional databases work the way they do: when they were invented, thirty or more years ago, resources were scarce. You couldn't make a database that said "I'll just keep everything," because you had this tiny little disk. So they invented all this update-in-place technology. Inside a database there are B-trees; they use blocks on the disk; they reuse the blocks, fill the blocks, rewrite them; they usually interact at a pretty intimate level with the memory management of the machine. And because they're updating in place while trying to serve multiple clients, they have a huge amount of coordination overhead, and that slows them down significantly.

So the approach we're going to take in breaking things apart is based on a few principles. One is to move to an information model. I made claims during my talk that using values and having an information model has architectural implications, and if you take away nothing else from this talk, I hope you take away that this is a real example of that in action: adopting a value-oriented model has architectural benefits — really substantial ones. So we're going to move to an information model, and we'll see how that plays out. We're going to split process and perception — I have a diagram later that will make that clear. We're going to treat our use of storage immutably; in other words, we're going to store stuff, but once we've stored it we're not going to change it. And the other half of doing that is that, in order to deal with process, we're going to have to manage novelty in memory for a window of time. Now I'll break all of these down.

To move to an information model means to move to a data model that is fundamentally about facts: we're going to have a database of facts.
That means sucking the structure out, because when you look at a relational row or a document, there's nothing fact-like about it. It doesn't say when, and the granularity is this composite thing: if I have a whole row for you and you change your email, and email is one of the columns, the row is not a fact — it's bigger than a fact; maybe it's a set of facts. We want to get down to single facts. That's going to be important for efficiency reasons, but it also dramatically simplifies things.

How many people know what RDF is? Not too many. RDF is an attempt to have a universal schema for information, and it uses something called triples: subject, predicate, object. I argued during the speaker summit on Sunday that that's not enough, because it doesn't let you represent facts — it doesn't have any temporal aspect. But stepping back, it's generally a good idea: it's atomic, and we really do want atomic facts. We label ours; we call them datoms, and we spell it differently so we can say "datoms" — because if we spelled it "datum" the plural would be "data," and then we'd be into "people don't know what that means even though they say it all the time." So we have a datom, which is an atomic fact — datoms are more than one fact — and it's just an entity, an attribute, a value, and some temporal component. It ends up that you could just put a timestamp there, but that doesn't give you a lot of power. If instead you say "this thing came in as part of a transaction," you can record the transaction there. If your transactions are first-class — and they are in this system — you can put the time on the transaction, but you can also put who made the assertion on the transaction, or where it came from, or whether or not it's been audited, or any other kind of provenance or characteristic. That's what we do. But for the purposes of this talk, since I'm not going to say much about that other stuff, you can read that T part as time. And that's the smallest fact.

Okay, so now we have the database state problem. We said we want the database to fundamentally be a value; we want it to be immutable. That seems to be a contradiction in terms, because we know there's going to be novelty: our business is going to run, and we're going to sell more stuff, or get new products, or have new customers, and that newness has to go somewhere. How does that square with the notion of a value? The best analogy I could come up with was tree rings. Think of the database as an ever-expanding value: it never updates in place, it only expands, it only grows outward, by accretion of facts. We just add more facts; we never go back inside — just like tree rings. You don't go back inside the tree rings; you just add more rings. We end up with something that really smells like a value and will function as a value, because we're only accreting: the past doesn't change. The core upon which we're building never changes, and that's really the key characteristic we expect of a value — anything I've seen before will never change. The implication is that as we have novelty, new stuff means new space, as I said in my talk. And this is super key: we're going to get a lot of freedom by moving away from places.
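(To make the smallest fact concrete, here is a minimal sketch of a datom as plain data — the entity id, value, and attribute are invented for illustration, and this is not Datomic's internal representation:)

    ;; A datom: entity, attribute, value, plus the transaction that
    ;; carried it in. The tx slot points at a first-class transaction
    ;; entity, which can itself carry the time, who made the change,
    ;; provenance, audit status, and so on.
    [42              ; entity    - what the fact is about
     :user/email     ; attribute
     "sally@ex.com"  ; value
     1234            ; tx        - the transaction this fact arrived in
     true]           ; added?    - assertion (true) or retraction (false)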
At this point, the other problem we have is how to represent change. We said we're going to accrete facts — so what is the granularity of change? We're used to saying "update this place; here's the address of the place, or the primary key of the place — go do something there." If we don't want to say that anymore, if we just want to accrete facts, then what is the fundamental unit of novelty? I have a new customer: what am I going to say to the database? What we're going to say is that, at the bottom, we can represent process as just assertions and retractions of facts. This new thing is true; that thing that was true is not true anymore — and it's still a fact that it was true from then until now. This ends up being the minimal possible representation of process: with this, you can express anything in these terms. So we'll say all the other transformations will expand into this, and I'll show you that a little later.

The other key thing we want to do with process is reify it. How many people have heard of event sourcing or something like that? One of the ideas behind it is this: look at a database that's been running for a while, with a lot of activity, and try to figure out what happened — how did it become what it is? You have no resources for doing that, unless you know how to read the logs. Maybe there's a transaction log, and maybe there's a way to reread it, but a lot of times you'd have to replay it, because it's just a successive set of modifications to places; it's really hard to read that and understand what happened. If instead we reify process, then when you add a new customer there's going to be something that says: there is a new customer, and that customer's name is Sally. That's what's in a reified version of process, where process is just assertions and retractions of facts. We want to make a thing out of that, because it's something we can store, look at, and understand when we look at it: this change happened; we added this; we retracted that email; we added a new email; we sold this — fact, fact, fact. It's an information system. And that's going to be great, because it will let us do some other cool things later, like eventing.

One thing that's important to understand about the accretion process is that it really does add to what's already there. If you look at the view of the database at any point in time, you can access the past — it's still inside, just like the inner tree rings are still there. It's not like there's a snapshot from last Tuesday and one from Wednesday and one from Thursday and one from Friday, each with more stuff in it; any time you look at the database, it includes all of history inside it. And this matters for an information model where we want to give people decision-making capabilities: "how much did things change in this window of time," or "count how many of those happened over this window." We need all of time in one place; we don't want a bunch of independent records — here's Tuesday's facts, and Wednesday's and Thursday's separately. So we have this growing tree-ring thing.
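(Read concretely, reified novelty is just data. A sketch in the shape of Datomic's :db/add/:db/retract transaction data — the attribute names are invented, and sally stands for an already-resolved entity id:)

    ;; The fundamental unit of novelty: assertions and retractions.
    ;; No "update this place" - just "this is now true" and
    ;; "that is no longer true".
    [[:db/add     sally :user/email "sally@new.example.com"]
     [:db/retract sally :user/email "sally@old.example.com"]]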
All right, so that's the plan — how do we do it? We're talking now at the model level. We want to deconstruct what we saw earlier, looking just at the server component: indexing, transactions, query, I/O, and disk. I think you can divide this up into halves, and that's why I said we want to separate process from perception. There's a process part: novelty processing. I have a new thing; it goes through transaction processing; maybe indexing happens on it then, or not — that's an open question at this point in the talk — and there's output to the storage system. Completely independent of that, there's a perception side to the use of databases: I have a question and I want to ask it; that probably leverages indexes — it almost certainly does — and there's input, input relative to me, as I read back from storage. We can separate these two things, and we can do it only because we've adopted an information model and immutability.

The model we're trying to get to — and this is not yet a physical model — is one where we empower applications. You can read "applications" as application servers, analytics servers, anything you want, even a desktop application. We want to empower independent applications with as much of these capabilities as we can: we want them to be able to perceive change, to react to it, to independently remember anything important to their decision-making, to make decisions, and then possibly affect the process they're sharing. Obviously there are going to be some shared resources: there has to be some coordination around change, and some shared resource around storage. We want to minimize the coordination necessary to support this. But that's the model we want, because once we have it, if we want a more powerful system — more query capability — what do we have to do? Just add more of these guys. And we don't really care about the coordination point growing, because we're not asking it to do much.

So, revisiting the whole tree-ring thing — now we're into implementation details — how do we represent this immutable, expanding value? We know one thing: whatever representation we use has to be organized to support query. It has to be sorted in some way; sorting things is really our fundamental source of leverage. And it ends up there's a technique used quite often in functional programming called persistent data structures. That word "persistent" does not mean durable; it's a different notion of persistence. It has to do with the fact that you can represent a large immutable structure — and it doesn't matter whether it's supposed to represent an array, a map, a sorted set, almost anything — as a tree, immutably, using something called structural sharing. Say this tree represents a sorted set — the view of it we have right now — and it contains all these nodes. If we want to add another child to one of the nodes, we're going to make a new version of the set, but we don't want to update it in place. So we allocate a new node for that leaf, and then copy the path to the root. Then we have a new tree that has the new piece of information in it and substantially shares structure with the old tree. And we can do that — why? Because it's all immutable. That's the underpinning of what are called persistent data structures.

We can do this in memory — it's done in memory by most functional programming languages. What we're going to do, though, is do it on disk, so we'll have durable persistent data structures with this kind of shape. In the earlier diagram we had a server with I/O and a disk. I don't want to know about I/O and disks anymore. There was a paper recently — "Disk Locality Considered Irrelevant" — and it's another one of those old notions that's now dying. It used to be that if you weren't the machine that had the disks, you were at a tremendous computational disadvantage: that machine has a card attached to the disk, it can get the data up into memory lightning fast, it has privileged access to that disk; accessing that data from another machine meant paying a huge overhead. Well, how much faster are networks now? Way faster. And what's the differential between disk and memory? It ends up that anybody who needs to touch the disk is losing, and the machine with the disk in the same box is losing only very slightly less than anybody else. That kind of locality is not the way we should be architecting systems anymore. We care a lot more about getting data into memory, with good locality in the way we do that; if we actually have to touch the disk, we're losing anyway. So we're going to wrap up both the I/O and the disk and call that black box "storage."

What we put in storage is two things. One is a log of all the novelty as it comes in — really an append-only thing. Somebody says "I sold this," "Sally gave me a new email" — fact, fact, fact — and we shovel those facts into storage as fast as we can. The other thing in storage is much the same shaped stuff we used to have locally: when the database was monolithic, it had B-trees on disk; we're now going to have those persistent trees in storage, with nodes we just don't change. Otherwise it's the same kind of idea — we've moved from disk to storage, and we're storing index segments that we will not mutate. The key thing now is that we're treating storage through a simple interface; it's just a key-value interface. We say: this block of the index tree maps to this segment, which is a blob of stuff, and it's immutable. So all we need from storage is that key-value thing. It's a lot like how the old databases said "this block on the disk contains these bytes" — we're just lifting it up to a cooperating-systems level. From storage, then, we need a key-value interface where we can store blocks of index under keys. We also need a little bit of modifiable storage for the roots: what's the current root of the whole database tree? That's something we need to be able to re-point at new versions of the tree.
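(You can watch structural sharing work with any of Clojure's in-memory persistent collections; the durable trees get the same behavior from the same path-copying trick. A tiny illustration with invented values:)

    ;; v2 is a new sorted set that shares almost all of its tree with v1.
    ;; v1 is untouched and will never change - it's a value.
    (def v1 (sorted-set 1 3 5 7 9))
    (def v2 (conj v1 4))   ; allocate one leaf, copy the path to the root

    v1 ;=> #{1 3 5 7 9}    ; the old value, still intact
    v2 ;=> #{1 3 4 5 7 9}  ; the new value, accreted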
The other characteristic we need from storage is that we have to be able to obtain those roots using a consistent read. So we now have a set of definitions for a storage service: it must support key-value storage, and every now and then we're going to ask it for a consistent read. Those are the two things we need out of storage. Otherwise, I don't care how it works or where it is — I don't want to know. I'm getting architectural flexibility by not knowing, and there are lots of things that can satisfy this. We can sit on top of DynamoDB, or a SQL database can satisfy those two requirements: you can treat a SQL database as a key-value store, stick blobs in it, and it offers consistent reads.

An index is a very simple thing: it's a tree. There's some root, with pointers to inner nodes, which in turn point to leaf nodes, and all that's in the leaf segments are sorted datoms — big blocks of datoms sorted in a particular order. You can imagine the orders. We said entity, attribute, value, transaction — so you'll have a sort by entity first, which gives you a great way to pull out things that look like objects or documents, because it's entity-oriented. But you can also have a sort oriented by attribute first: the same data, just sorted a different way — a second copy of it. An index driven by attributes is going to feel like what kind of database? A column store. A column store stores all the email addresses together, and all the phone numbers together, and column stores are very powerful tools for analytics. You can store other flavors too, but there are only, you know, six ways to sort them.

Then there's the actual job of indexing. We get novelty in, and we want to incorporate it into these trees. Obviously, once we have a lot of data, those trees are going to be huge, and we said they're immutable — so incorporating each new piece of information means doing that path-copying job. Do we want to do that for every new fact? No, that's a disaster; we can't do that. I'd call that maintaining the sort live in storage, and it's not something you can do efficiently if you're going to treat storage immutably. Google's Bigtable is an example of a solution to this problem that other people have used as well. The idea is: everything new is already being logged to storage, so this is not a durability question; it's just a question of how often you integrate novelty into that big, relatively expensive-to-create index. The way Bigtable works, it accumulates novelty in memory until it has, say, 64 megs of it, then blows that out onto disk, and a separate process later takes that plus the big sorted flat file — everything it knew before — does a merge sort, and produces a new flat file. None of the files ever get changed; it's the same idea, all immutable. The biggest difference between that and what I've been describing is that we have trees of smaller segments and can share structure. When Bigtable creates a new flat file, it shares nothing with the old one, and it's huge — so it's not particularly addressable, and it's not easily cached in chunks. By using trees you get fine-grained addressability and nice small chunks that are good for caching. You also get the potential for structural sharing: I can integrate new stuff into the next version of the index, and it can share a whole bunch of nodes with the old index — and if those nodes have been cached, so much the better, because we know they're never going to change.
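(To pin down that storage contract, here is a minimal sketch — my own protocol and names, not Datomic's API — of the two things being asked of a storage service:)

    ;; Everything stored under a segment key is written once and never
    ;; changed; only a few named roots are mutable, and reading them
    ;; must be consistent.
    (defprotocol StorageService
      (get-segment [this k]   "Fetch an immutable index segment (a blob).")
      (put-segment [this k v] "Store a segment under a fresh key, write-once.")
      (read-root   [this root-name]
        "Consistent read of a mutable root pointer.")
      (set-root!   [this root-name segment-key]
        "Re-point a root at the root segment of a new index tree."))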
So we accumulate novelty in memory. Any time we want a current view of the world, we need to merge what we have in memory with what's coming from storage — just do a dynamic merge join between the two. And every now and then, something has to integrate what's in memory into storage. As soon as that's done, everybody can drop that window from memory; we don't care about it anymore, we start fresh, because we know it's now in the index in storage.

That looks like this — still pretty much a logical model. Whatever is handling transactions takes novelty and immediately logs it; that's where you get your durability, so if the thing dies, the novelty is somewhere. But the log isn't organized in a leverageable way, so the transaction handler also puts the novelty into an index in memory we'll call the live index. It's sorted — and it's very inexpensive to create a sorted set in memory. Then, sometime later, occasionally, some other process takes whatever's there and merges it into storage. There's a lot more efficiency to doing it that way — a whole batch of novelty making one new tree — as opposed to doing it for every new transaction.

That's the process side. The perception side is really straightforward. If I want to see what's going on — and it doesn't matter what "it" is: a query, an analytics thing, or just an ordinary "get me this entity" — if I want the current view of the world, I need access to the live index and access to storage. Look at everything I don't need access to: I don't need to be near the transaction-processing system; I don't need to have anything to do with it. Is there any coordination associated with doing this? No. Something has happened over the last couple of slides: what happened to read transactions? They're gone. Where did they go? They just disappeared, because a correct implementation of perception does not require coordination. Perception in the real world does not require coordination. We are left with one lingering question: if the live index is actually local to me, how does it get updated? I'll get to that in a second.

Okay — we'll have a couple of names that may be new as we look at the real architecture. We've broken stuff down: we have storage, separate; we have perceivers — who might be doing queries — separate; and then something that processes and coordinates transactions. We're going to call that the transactor. Anyone who has all those capabilities — perceive, remember, decide, react — we're going to call a peer. They're equals — very powerful equals in this system — and they're your own application servers. And finally there are storage servers; ideally a redundant store, one of these newfangled storage systems that do distributed redundancy. They really have some great properties — as long as you don't try to treat them like a database. Treated as a key-value store, they are awesome: highly reliable, highly available, scalable, distributed. There's a lot of power there, and we want to use it.
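(The perception-side merge is just a merge of two sorted sequences of datoms. A hypothetical helper — my sketch, not Datomic's implementation — assuming datoms shaped like the earlier [e a v tx added?] example:)

    ;; Combine the storage index with the live in-memory index, let the
    ;; newest fact about each [e a v] win, and drop anything whose
    ;; latest word was a retraction.
    (defn current-datoms [storage-datoms live-datoms]
      (->> (concat storage-datoms live-datoms)
           (sort-by (fn [[e a v tx _]] [e a v tx]))
           (partition-by (fn [[e a v _ _]] [e a v]))
           (map last)                          ; highest tx per [e a v]
           (filter (fn [[_ _ _ _ added?]] added?))))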
So here are the same jobs, rearranged. For the purposes of this talk I'll say the transactor will occasionally also do the indexing job — it does in the current implementation, just because it has the spare cycles and it's convenient — but any time you want, you can move that to a separate box. We'll start with some novelty: your app has some novelty, and it communicates it to the transactor. The transactor logs it right away. It has its own live index, and it puts the novelty in there. The other thing it does is transmit that novelty to all of the peers, and that's how the live indexes are kept in sync: it's just a rebroadcast of the novelty, and only the novelty. I'll talk a little more about this process in a second. The storage service we don't care a whole lot about — it's very much a black box; it needs to be a key-value store that can support consistent read. This works on top of DynamoDB; it works on top of Postgres or any relational database; it works on top of Infinispan if you want a big memory grid behind it — if you don't actually ever want disks, put Infinispan behind it and now you have a big memory grid. Basically anything that can support those requirements can eventually be supported; those are the ones we support right now.

If we look up at the peers: they obviously have a communications component so they can talk to the transactor; they have the live index in memory; and — ignoring the caching for the moment — they have the ability to talk directly to storage. This is the other critical thing. We said: who has locality to storage now? Nobody. Nobody has locality to storage in this picture except the storage infrastructure itself; the transactor is no closer to storage than the peers are, because there's no advantage to being close to storage — it's just a service, a server somewhere else. But once that's true, it means everybody can read directly. (Yes — there's one transactor; I'll talk about that in a second.) Everybody has access to storage, and that also means: where does query live? Query can live anywhere you want. What does a query engine actually need? Access to the organized data — access to the indexes. Since anybody can get at the index, we can put query anywhere we want; there are no special machines for query. In the case of Datomic, you end up with a library you put in your Java app — your JVM app — that has a query engine built into it, with the live index built in; I'll show you what that looks like in a second.

The other thing that's really critical: we have this storage, it's remote, it's at least a network hop. We said the network hop isn't costing you nearly as much as the disk would if you actually had to touch the disk, but it's still a hop — can we alleviate it? What's safe to cache from storage? Everything. Why? It's immutable. When does it expire? Never. These are the things you want to hear as an architect — this is sweet. We can cache this stuff relentlessly, anywhere we want. You can have a local cache; you can set up a memcached cluster and cache stuff there, so that when peers don't find something in their local cache, they can pull it from storage and put it in a cache that a whole bunch of peers share. This is very powerful, but here's what's different about it: whose problem is looking something up in storage and putting it in the cache when it's missing? It's not your problem — it's the system's problem, this library's problem. You never see it. Effectively, you ask a query; the query tries to find the data; it's either in the cache, or the system figures out how to get it. This is caching under, caching inside — not an application-level problem — because this caching is mechanical; there's no special logic around it. The system can do this caching well for you; all you have to do is start up memcached and tell the system about it.

[Audience question.] No — you don't bring everything in. Say this app over here does analytics on pricing: it's going to be interested in a certain portion of the data, and that's what it'll cache. This other app underneath is the website; it's putting up product pages and reading different kinds of stuff. This is not a replication of everything; it's not even proactive replication of anything in particular. Each peer pulls in its working set, depending on the work it's doing, and it never needs to pull in anything else. Of course, these being trees, every peer will have the top part of the tree — which is small — fully cached. And it ends up that the diagram I showed you with three tiers: that's it; it doesn't get any deeper than that. Which means that once you've cached the top of the tree, you can find any piece of information you've never seen before in one read — potentially one read from memcached, and some of these storage services, like DynamoDB, are really fast anyway. So it's an on-demand cache, filled with your working set; different peers will have different working sets and different amounts of stuff cached.

[Audience question.] No — that part is writes; that's the write path. I have some novelty, I send it in as a transaction, it gets written there. Everything else is read.

Let's just talk it through. For simplicity, imagine that indexing takes the log and turns it into the sorted tree (it doesn't literally work that way, but imagine it does). The log is just append, append, append; the index in storage is the sorted tree. Say indexing last ran an hour ago. Everything that's happened since then is right here in the live index: a transaction came in — boom, it got logged, but it's not in the storage index yet — and it got reflected back out to the peers, so it's in their live indexes too. Now I want to put up Sally's profile page, and within the last hour Sally told me her email address changed. That fact is in the live index; all the facts Sally told us prior to an hour ago are in the storage index. When I ask, "tell me everything you know about Sally, because I need to show her profile page," the query engine does a merge join: seek to Sally here, seek to Sally there, and merge the results together.
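(Through the peer library, that read looks something like the sketch below — the Datomic peer API calls are real, but the attribute name and connection are invented. Note that the code never says, and never knows, which datoms came from the live index and which came from storage:)

    ;; "Tell me everything you know about Sally" - runs in-process on
    ;; the peer, against a db value that already merges live + storage.
    (let [db    (d/db conn)
          sally (d/q '[:find ?e . :where [?e :user/name "Sally"]] db)]
      (d/touch (d/entity db sally)))   ; realize all of Sally's attributes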
If there are retractions in the live index, they'll make the corresponding older facts in storage invisible, but what you get is the sum of the information from the two. Eventually, say we let this build up for an hour and decide now is a good time to index: we make a new tree in storage. It will share a lot with the old tree, but it will now incorporate Sally's new email address. When that job is done, everyone is informed, the live index empties, and we start accumulating the next window of change. That's how it works.

[Audience questions.] Yes — the live index is actually local, in memory, on every peer; that's correct. Yes — every change is broadcast to every peer. But keep in mind these are servers; you're not going to have ten thousand of them. A peer is really something like an app server — that's the way to think of it. It, in turn, is probably serving other people, the way your app servers do today. It's not necessarily your web tier — although ten thousand peers in your web tier would be a lot. We don't currently have a multicast infrastructure for making that propagation work if you wanted tens of thousands, but that would be the way to approach it. And yes — that does happen; I just didn't want to draw that line because it makes the diagram messy. That stuff does come over here, but it's not the same as what's there — it's not organized yet; the organization happens here. Let me keep moving, and I'll take questions at the end.

On to process itself. I said we can boil everything down to assertions and retractions, but you can certainly think of transformations you couldn't express that way — like "add ten dollars to your bank account." That's the logical process I want to run, but the fact that comes out of it depends on the existing value of your bank account. How do you do that? With transaction functions. A transaction function is a function of the database and some arguments, and what it yields is transaction data. We said transaction data is assertions and retractions of facts; now we'll say transaction data is assertions and retractions of facts, or calls to transaction functions with arguments. So we can have a transaction function, "add," and it will be passed the database. We say: add ten dollars to Sally's bank account. That function runs inside the transaction; it's given the current value of the database — and because the database really is a value, this is a real function of a real value. It can perform any queries it wants, including looking up Sally's current balance, and then it yields more transaction data. If it were that simple a thing, the function would look up the balance, find it's 100, and expand into the fact "Sally's new balance is 110." You can imagine any transaction as assertions, retractions, and potentially calls to transaction functions; those transaction functions can expand into calls to other transaction functions, but eventually everything expands into assertions and retractions. This expansion happens over and over until it's all assertions and retractions, and it allows you to do any arbitrary transformation of the data in the database — you can ask questions, you can do anything you want.

The cool thing is you have a decision — again, architectural independence — as to when you do this. If you know you're adding pure novelty, you don't need to run anything inside the transaction; you just say, "I have new facts, and I know they're new." If you think you have very little contention, you can say, "my local view of Sally's bank account is 100, so I'm going to make it 110," and submit that with a condition that fails if the balance wasn't 100. That's more optimistic, and if the calculation is expensive and the chance of collision is really low, it's the more efficient way. Or, finally, you can do it the old-school way: send the function all the way into the transaction, where it runs atomically, inside the transaction, with no interference. We call all of this expansion, and it's what the transactor does. It accepts transactions, which are just data — assert, assert, retract, retract — and even a function call is still expressed as data: it says "call the update-salary function with this argument on this target entity." The transactor expands them, applies the result to the in-memory view of the database, logs it, acknowledges the transaction to you, and then broadcasts it to everybody else. Every now and then the indexing job runs, and, as I said in my talk, the storage equivalent of garbage — and of garbage collection — falls out of doing it this way: once you make a new index, nobody cares about the old one; the root and a bunch of nodes of the old index are junk in storage, and we have to clean that up. But the analogy to memory is really good.

The peers have direct access to storage, their own query engine, their own live memory index, and they do the merge themselves. There's also a two-tier cache going on, and it's inside, underneath: when you ask a query and the data isn't at hand, the system looks in a local cache, which is at the object level; if that misses, it looks for segments in memcached, if you've configured it; otherwise it goes to storage — and then it caches what it found. I won't go too deep into this because it's involved, but the picture is: there's an in-memory persistent data structure, the live index — which is really immutable in memory too; we just move from one version to the next. There's infrastructure for finding data in storage, which goes through the cache; and it's all trees, whose roots will almost certainly be cached in memory. The database value itself is just a pointer — a struct that points to these things — sitting inside a box. The only mutable thing in the entire system is that the contents of that box move from one immutable structure to another, whenever the memory index is updated or an indexing job completes. There's nothing else mutable in the system: one identity, which is "the database as of right now." Which means that if you've obtained that value and you start running a long-running query, nothing you've got will change underneath you — even amongst threads in the same process. You're really working with a value.
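(For the bank-account example, a transaction function might look like the sketch below — written in the style of Datomic's d/function, but with invented names and a deliberately simplified body, with no validation or error handling:)

    ;; A transaction function: a function of the database value plus
    ;; arguments, yielding more transaction data. It runs atomically
    ;; inside the transactor during expansion.
    (def credit-fn
      (d/function
        '{:lang   :clojure
          :params [db account amount]
          :code   (let [bal (or (:account/balance
                                  (datomic.api/entity db account))
                                0)]
                    [[:db/add account :account/balance (+ bal amount)]])}))

    ;; Install it under a name, and invoking it is just more tx data:
    ;;   (d/transact conn [[:account/credit sally-id 10]])
    ;; which the transactor expands into something like
    ;;   [[:db/add sally-id :account/balance 110]]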
So what are some of the characteristics? The first thing — somebody asked earlier whether it uses Lamport clocks or vector clocks — is that one of the cool things about separating perception and process is that you can now make independent decisions about availability and scalability for the two, because they're completely separate. The decision made by Datomic — because I think it's a market need with a lot of value to companies — is to make a traditional decision on the process side: it's transactional, it's a consistent view of the world, it's a single-writer system. The cool thing, though, is that the transactor isn't doing anything else. It's not serving queries or anything — all it has to do is handle writes. If you want arbitrary write scalability, you're going to have to give up transactions and queries, and everybody adopting NoSQL databases that make that choice for you is accepting that trade-off. I think it's not a great trade-off for a lot of companies, and that's why I'm in this space, trying to give them this hybrid solution that's a combination of the two. On the read side, though, we get all the benefits of distribution: put DynamoDB in that storage slot and you have arbitrary scaling, all of Amazon's goodness about storing things in different places, knobs for read and write throughput — a tremendous service approach to storage. Or, if you're already running a SQL database, you can leave it there and just start adding data in this format on top of it. It's an independent decision — but if you do choose a redundant, scalable storage subsystem, you get the benefits of having done so. So you get scalable reads. And query scales too, obviously, because query isn't in one box — query is in every peer box. Not only is that scalable, it's very elastically scalable: it's not like adding a new box to a cluster configuration, which is kind of a big job. This can scale up and down with, say, Amazon's auto-scaling: more peers come up and down with load, and you have more query capability. So it's elastic.

I'm going to hurry through the last couple here. You definitely have more flexibility: the schema model is extremely small. All you define are attribute definitions — the name and type of an attribute, what its cardinality is, things like that. There's no higher-level thing. The fundamental unit is a fact; the only configurable aspect of a fact is the attribute; so that's the only schema that exists — and there is schema. The net result is that it feels like you're programming with multimaps: if you take an entity-oriented view, every entity is a key-value set where each value can be multi-valued if you want. It's a very convenient, flexible, easy-to-use programming model.

The other huge benefit you get from this is time. The database really is a value, and it includes all of the past, which means you can take any database value you've got and say: pretend it's as of last month. That doesn't do anything — it doesn't change anything, it doesn't throw anything away; it just pretends it's last month. Now you can ask queries of that database, and when answering them, if it finds anything newer than last month, it just doesn't use it. So you can ask as-of questions against the current database for a past point in time, and they don't cost anything.
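(All the schema there is looks roughly like this — attribute definitions as plain data, sketched in Datomic's schema-as-data style, newer shorthand, with invented attribute names:)

    ;; The entire schema model: attributes with a name, a value type,
    ;; and a cardinality. No higher-level table or document shape.
    [{:db/ident       :user/email
      :db/valueType   :db.type/string
      :db/cardinality :db.cardinality/one}
     {:db/ident       :user/friend
      :db/valueType   :db.type/ref
      :db/cardinality :db.cardinality/many}]  ; values may be multi-valued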
The other huge benefit you get from this is time. The database is really a value, and it includes all of the past, which means you can take any database value you've got and say: pretend it's as of last month. That doesn't do anything; it doesn't change anything, and it doesn't throw anything away; it just pretends it's last month. Now you can ask queries of that database, and if the engine finds anything newer than last month, it simply doesn't use it. So you can ask as-of questions against the current database for a past point in time, and it costs nothing to do that.

The other critical thing: how many people have ever built systems with timestamps and had to do as-of queries? It's brutal, because you have to flow that T, that time, around everywhere. If I want to say "as of last week," last week has to be part of every query and every join, and it's nasty; you have to take maxes and all of that. Very tough stuff. Compare that to what I just described. I walk up to a database and say: as of last week. I can have a query, say, get me the sales subtotals, that works against now. I can tell the database, give me yourself as of last week, take the same query, and run it against that database value. It's not parameterized by time anymore; the database knows it's supposed to be pretending it's last week. The same query works, and times are not part of the join. Of course, you can also write queries that compare times and the like, so you can query as of a point in time, and you can do windows since a point in time.

The other thing you can do, because the database is actually a local value in your process (you have that memory component, but it's really local), is as-if: what if? What if I added this stuff? What if I committed this transaction? What would the database look like, and what would the answer to this query be? Do I need to go to the server to do that? No. I can take the in-memory data structure, which is persistent, and make a new version of it that includes the transaction data I'm thinking about committing. That's now another database value, largely in memory, though it can include stuff from storage; the novelty is still in memory. Now I can take the same query and ask it without anybody's help; I'm not bothering anybody. I can do the what-if, see that the query still works, and then try to commit it. Or maybe I'm just doing what-if analysis for people and never commit: we're just trying to decide what would happen if we made this decision, what the world would look like. I don't have to put the data into the database to answer that question; I still have full query capability.

Perception and reaction: perception is straightforward, since we have this immutable thing, and all the queries so far have been flavors of perception. But reaction is now easy too, because we have the live feed: the transactor is sending all novelty to all the peers, which makes it easy to raise an event on a peer that says some novelty came in. The other beautiful thing is this: suppose the novelty is that Sally changed her email address, and it's important to you to ask whether it's the same as it was. Can you do that? Sure, because treating the database as a value means you can capture the database value at the moment the change was made, and in fact it gets sent to you with the event: here's the database before the change, here's the data of the change, and here's the database after the change; ask any queries you want. If you want sophisticated eventing, you just query that data. You say, I got this event, but I'm only interested in certain kinds of changes, and you query the feed. That works because the query engine is in memory and operates not only against the database but also against in-memory data structures, combinations of your own memory and what's coming from the database, so you can filter that way.
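Here is a hedged sketch of those three capabilities, time travel with d/as-of, what-if with d/with, and the reaction feed with d/tx-report-queue, reusing conn and sally from the earlier sketches; the query, the date, and the email-watching logic are all illustrative. The basis-t handle at the end anticipates the point about communicating a basis, below.

```clojure
;; Time travel: the query carries no time; time lives in the database value.
(def balances-q '[:find ?e ?balance
                  :where [?e :account/balance ?balance]])

(def db-now       (d/db conn))
(def db-last-week (d/as-of db-now #inst "2012-08-16")) ; illustrative date

(d/q balances-q db-now)        ; answers as of right now
(d/q balances-q db-last-week)  ; same query, pretending it's last week

;; What-if: d/with builds a speculative database value that includes the
;; proposed transaction, without sending anything to the transactor.
(let [what-if (:db-after (d/with db-now
                                 [[:db/add sally :account/balance 120]]))]
  (d/q balances-q what-if))    ; query the world as it would be

;; Reaction: consume the live feed. Each report carries the database
;; value before and after the change plus the datoms of the change.
(def queue (d/tx-report-queue conn))

(future
  (while true
    (let [{:keys [db-before db-after tx-data]} (.take queue)
          email (d/entid db-after :person/email)] ; hypothetical attribute
      ;; tx-data is a seq of datoms [e a v tx added?]; react only to
      ;; assertions of the watched attribute.
      (doseq [[e a v _ added?] tx-data
              :when (and added? (= a email))]
        (println "entity" e "email now" v
                 ", was" (:person/email (d/entity db-before e)))))))

;; Communicating a basis: hand someone the database time T instead of
;; saying "go look in the database".
(def t (d/basis-t db-now))
;; elsewhere: (d/as-of (d/db conn) t) reconstructs exactly what I saw
```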
So I think you get a lot out of this. In particular, you get simplicity in the database. The state model is what I would call epochal: it only moves from one valid, consistent point to the next; there's never any in-between. There's no read coordination or anything else; the only coordination is for process, and that's done in the transactor. Every time you ask the same query, you get the same results, so you have a stable basis for decision-making.

Suppose you think there was a problem with the system in the middle of the day and you're not going to get to look at it until next week. How many people have had that happen? The queries were returning weird results around 5 o'clock, everybody wants to go home, so you'll look into it tomorrow; and when you come back, the query is fine, just because more data was added to the system. How are you ever going to find out what was wrong? You're toast. With this it's straightforward: ask that query as of five o'clock, because I know at five o'clock that query was screwy and giving me weird results, and you can figure out the answer to your problem, because you can get a basis any time you want. And the transaction is really well defined: a transaction is a function of the database value.

The other important thing is that you can communicate a basis. If I make a decision, or I want to give you work to do, I have a way to communicate it. I'm not going to say "go look in the database"; I'm going to say there's some work you have to do as of database time T, and you can go and look at exactly what I was seeing, so you know what to do. And we've already seen that, architecturally, because we've broken things apart, we have the freedom to relocate things. We don't care where the queries run, we don't care where the storage is, and we don't care where the transactor is running; you have a whole lot of flexibility about where you put things. You can put the data up on DynamoDB, run the whole system in your LAN with memcached in your LAN, and isolate yourself from the fact that the Internet is potentially in between. And we saw the time-travel stuff, and we saw the event processing.

So the net result of deconstructing the database is that you're able to treat the database as a value, in precisely the way I was talking about in my talk. You have a real information model. You have a system that's substantially less complex: it's easier to understand and easier to change; if you want to change your storage, you can do that; it's easy to replace components and easy to relocate things. It's more powerful and more scalable: if you want more brains, you start more peers, and there's less coordination, which also helps scalability. And the information model not only lets you remember everything, which is important for your business because they want to make decisions, but also gives you more flexibility, because a model of facts is easy to turn into any shape you want. And with that I'll wrap up and answer any questions. Thanks. [Applause] [Music]
Info
Channel: InfoQ
Views: 85,921
Rating: 4.9479289 out of 5
Keywords: database, rich hickey, jaxconf, marakana, techtv, datomic, development, architecture
Id: Cym4TZwTCNU
Length: 66min 23sec (3983 seconds)
Published: Thu Aug 23 2012