(smooth electronic music) - Hello, everybody, thank
you very much for coming. Good morning. Hope you are doing well. I'd like to talk today
about conflict resolution in distributed systems, that is if several people change some data
independently of each other, what happens, how do we
resolve those conflicts that'll occur. My background is I'm a
researcher at the University of Cambridge. I was previously in industry and a bunch of internet start ups. So I worked at Linkedin
for a couple of years, for example. At the moment, I'm working
on this research product called Trve Data. Spelled T-R-V-E and what we're trying to do here is to bring end to end encryption to a larger range of applications. So think something like Google Docs where several people can edit a document at the same time online but without having to trust Google servers because what we want to do is to be able to put data on
various servers in the cloud but not have to worry about what happens if they get compromised or so on. So that's kind of the
background of all of this. I'm not talking about the encryption and the security protocols today, I'm only focusing on one little
piece of that whole project which is what happens if
several people edit data at the same time and
how do we resolve that. So I'd like to start with a scenario that will probably be familiar with you, which is you, a little
blue stick figure here, are hacking on some code on your computer and at some point you decide
that this code is done and you commit it using your favorite version of control system. I'll just use git as an example here and so at this point, you'd put the code in the repository and then
maybe you push it somewhere so that other people can
see that `code as well. So in the case of Git,
you maybe you'll push it to a repository on Github and this is now the
communication mechanism for people with your team. So if there's somebody else, say, this little red stick figure who is also hacking on code, then well you can synchronize up through the central repository. This is all very familiar, this is what we do everyday and so the little red person here might independently at the same time also be working on the same code base and also do a commit and I'll, what happens if you
this person now fetches from Github, well they'll have to either do a merge or rebase if
that's how your work flow goes or something along those lines. So somehow these changes are going to have to be combined together and as you've probably experienced, if people change different
files in the same repository, that's no problem, they will
just get merged cleanly. If one person changes
the beginning of a file and another person changes
the end of the file, that's probably okay because
the version control system will merge them automatically. If people change the same
part of the same file, then you're going to have to
resolve the merge conflict yourself and then we have these tools for doing three way
mergers for copying patches from one side to another and figuring out what the results should be. So you've probably have to
find with things like this. This is exactly the kind of
problem I'm talking about. But this problem happens not
only in software development. It's a very general purpose problem. So imagine you're a lawyer
working in a law firm and maybe there's a
contract being negotiated with, you've got a client on one side and the other company's
law firm on the other side and everybody is sending these versions of contracts back and forth and the contracts are probably
Microsoft Word documents because that's how lawyers work and they send these things by email and so you've got one
person making changes to these Word documents and then hit save and then at the same
time, maybe somebody else, maybe at the different company is also updating the same document. Changes it and now you email
these changes to each other and so this is actually
very much the same data flow and I actually just
reused the same diagram and changed the labels. You've got the email as
the communication path and at some point these changes are going to have to be merged together and now before Microsoft Word, I'm not sure there is even this kind of nice user interface
for three way mergers. I know you can compare two documents but I think what people end up doing is manually copying the changes from one version of
the document to another and so performing this
merge really manually. So that's kind of this
oh crap situation here. So in this case, it's
kind of best to have like an informal lock where one person says, okay I'm going to be
editing the document now. Please don't change it for the next day. I'll send it to you and
then you can edit it. So people try to sequence their updates like this crude manual communication. What happens in another? Let's look at the third example. Let's look at a to do list. So this is maybe a shared to do list where me and my wife together have this shopping list
where I can add stuff and she can add stuff
and then whoever next goes to the shop can buy those things. So here buying milk is added to the to do list and
let's say this to do list is stored on a central server. Now so thats what allows us to communicate and so I add buy milk to the to do list and press okay button
and so it does a request, like maybe an http post request over the network, stores
it there, it comes back and says okay that was
added to the to do list and at the same time, maybe my wife goes, oh you need to water the
plants, so I just remembered. So add stuff to the to do list also does this post to the server and comes back okay and so in this case,
actually what has happened is if this central server stores its data in a data base and it uses
something like transactions. If this is a relational database say, then we actually have
serialization going on and that is these updates
are actually applied in a sequential order, in a serial order. That's where serializable comes from and database transactions. Which means they're applied one at a time. So you don't actually have
the same concurrency problem as we had with the code editing and with the Word documents
being sent back and forth. Cause actually, there's one
primary copy of the data and that lives on the server and that's being updated one transaction at a time in sequence. So in this case, you don't get this conflict resolution problem. So this seems nice but on the other hand you have a problem which is well, what if I don't have signal
on my mobile phone right now or if the network is interuppted
for some other reason or I can press the button to save and I'll just a spinning,
spinning wait indicator and nothing will happen. So here we have this,
it's kind of obvious, this problem if you don't
have internet connection then you can't reach the central server so you can't store any data. You can't edit the data in anyway. So it doesn't work offline. So these are the problems here. We get the advantage from a central server that we don't have to worry about this concurrency and these different edits happening at the same time and having to merge those together but at the cost of it only works online. So we now we require constant
connectivity to the server. Which is not so great. Especially if you're doing
things on mobile devices. Another problem is that these requests are synchornous, so when
I press the save button, I have to wait until I get
the okay back from the server and only then I notice it
is actually being saved. So until I get the okay back, I don't know whether the network request actually made it through or not. Maybe if the network is unrealiable. So that also can be a timing problem. Let's see in the document editing case, say in the case of Google Docs, you know you can press one letter and then a second later that letter will appear on the screen for another person who's
editing the same document. So there the edits of,
the units of editing is a single key stroke. Its not even like a commit
or something like that. It's just a single letter and so in that case, if
you wanted to send that through a central server than every time you press a letter, you'd have to first
send that to the server, wait for it to be saved there, wait for the server to come back to you, say okay, and then you could display the letter on the screen and so you would have to
wait that network round trip for every single change to the document. Which is actually what happens exactly if you're editing a file over SSH. So if you're logged into a Linux server and you're using Vim or Emax or something on the server and you type a letter then that edit actually
lives on the server. So you've actually got
that synchronus round trip. So if you've got a slow
one, reliable network, and using SSH is pretty painful, as you've probably experienced. Our final problem with putting everything on a central server is that now there's this single point of failure and people make fun of Github for example, every time Github goes down, people say, oh we've got
this nice decentrilized version control system and what do we do? We put everything back
in a centralized service. Isn't that great? So the sync can be disrupted. If you worry about denial
of surface attacks, for example, or if you
worry about blocking, maybe in some countries
that don't have such free internet access, if you're critical of the local regime or
something like that, actually this kind of blockability is quite a problem. So what we'd like to do is to figure out how to
not have these problems of the single central server but at the same time, not have all of the problems of having to do the merges by hand. So coming back to the to do list example, if we think about this having to synchronously having to communicate with a server, that's okay if
the server is in the same town but if the server is on the
other side of the planet, then this is quite slow because it's simply speed of light takes a while to go all the way round to the other side of the
world and back again. So what if we just put
data centers in several different places? So let's say when the
blue person adds buy milk to the to do list, that request just goes to the blue person's local data center, call it data center one
and get saved there, and when the red person adds water plants to the to do list, then that goes to the local data center, too. Those are two different
local data centers. If the people are in different locations, and so now that's okay. So each person can get the
response from their local data center and now these changes will be propagated asynchronously and so the speed of light
going all the way around the planet is still as
slow as it was before. So it could happen that two
people make these changes without knowing about each other and then you simply don't
know which one of these actually came first. Because you know it's not really defined. Did the blue come first or the red one. From the point of view of data center one, first the blue one came and
then the red one came in but from the point of
view of data center two it was first the red
one then the blue one. So if this adding an
item to the to do list means just appending it
to the end of the list, then which order should
these items appear? Now we certainly have
to worry about the fact that the data center one might have buying milk first and
watering the plants second and the other data center
might have them flipped in the other order. Oh dear. Let's think this further. Even without two data centers, we can just make it really extreme and say okay we're not even going to talk to the data center, we're just going to talk to our local storage on the device and I'm going to treat the
storage on my mobile phone as a data center though it's
the best data center ever because I can't have a
network interupption between me and the storage on my phone. So this communication
is always going to work. So I can just store something locally and that will be nice and fast and the red user can
similarly store something locally and then we can just kind of have some kind of blah blah network stuff to synchronize the changes around and if you think about it, this is exactly the same as
the synchronization process that has to happen with
the two data centers, exchanging data and this is exactly the same synchronization process that happened with the Git commits being sent via Github or
with the Word documents being sent by email. It's all exactly the same, which is you've got some
changes which are happening concurrently, independently
from each other and those changes are being
propagated asynchronously and here, of course, now we're back to this problem of conflict. So take the example of adding two items to the to do list. You can kind of imagine
that quite easy to resolve. You just make sure that you somehow decide on a consistent order. Maybe using some Ids or something or some user Ids. You can order them arbtrially but what if one item, one change is deleted to do item and the other actually
edits that same to do item. So the red one changes buying
milk to buying soy milk and the blue one just
deletes the buying milk item and these two changes
happen without knowing about each other, so how do we resolve this? Because if you say that the deletion wins. So, okay this item was deleted, the to do list item was deleted so that edit to change
buy milk to buy soy milk is just gone because the
item that it refers to is no longer there but that means that we've
forgotten about the fact that soy milk was added in there and so maybe that's an important fact that we've now lost. On the other hand, we can do
it the way other way around. We could say that this
buying soy milk wins but when the deletion has been lost and so maybe the deletion is meaningful. So how do we resolve
these kind of conflicts? In general, we have concurrent operations that happen, concurrent
doesn't mean necessarily at the same moment in time. It just means they happen
without knowing about each other and somehow we need to achieve convergence where everybody agrees on
the same data at the end, and so people talk about
eventual consistency which is kind of the data base term that's often used for
describing these types of system where you simply, you can
allow different things to happen concurrently and you just want to end up in the same state at the end. The problem with the
term eventual consistency is that it's actually very vaguely defined and so people are very
imprecise when talking about it and I'd like to break it
down into three points, three properties that
are a bit more precise. So I'm going to firstly assume eventual delivery and
that is we sent messages over the network and we're going to assume that the network is not
interuppted forever. So we're going to assume
that after some finite amount of time, you get a
network connection again and so if you keep
retrying then you can get the message through eventually. You kind of have to make this assumption because if you assume that a device can be offline forever, then it
can never come in synch with the other devices again. Just by definition of offline. So we're just going to assume that eventually some messages go through but we're not going to
make any assumptions about how quick that is. So if I go to Iceland and there's no mobile signal and I'm hiking somewhere up on the top of a volcano for two weeks, then I
won't get any messages for two weeks and when I get back to Reykjavik, then I
will get some messages when I reach an internet connection again. So that means my message delay
is two weeks in that case. That's okay, we're just going
to assume that's alright. Secondly, the second part
of eventual consistency is making sure that everybody
ends up in the same state. That is if we assume
that eventually everybody gets all the messages, then if two people have received the same
messages, then they should be in the same state. So that means even if
they received the messages in a different order, they
should end up in the same state and finally what we want
is to not lose data. Now this, it seems like a
kind of ridiculous point because of course nobody
wants to lose data but it is actually quite
important to make this explicit. So quite a few data
base users use this mode called last writer wins where you say if two
people change the data at the same time, we're just going to pick one of them arbitrally based on the time stamp
as being the winner and the other ones were
just going to throw away and so this is like
equivalent if you think about the word document example of, well we've got two people
editing the Word document and one person emailed
their change to another and the other person is just going to say, well I made changes myself as well and I'm going to declare
that my changes are newer than your changes and so I'm just going to
ignore your changes, sorry, and so this is not very friendly to people because changes get lost. So let's require not losing data as well. So this is what I mean
with eventual consistency and this is the kind of data structure to which we might want to apply it. So I'm going to model this to do list now as a JSON document and
you can just imagine it's like a list of to do items which is like each to do item has a title and the flag done,
whether it's done or not, true or false and then
maybe there's some settings and stuff as well on the side, and so the main data
structures that we have here are an ordered list. That is you've got a list of to do items and they have to appear in a certain order and the user specifies this order and also you have maps. So you've got, maybe one adjacent object inside another adjacent
object or something like that. Now once we've got these data structures we can then make various changes to them. So as the user interact
with the application, they will make changes. For example, they might set
watering plants to true. So they say, okay this is now done, I'm going to check the box on my phone and what that does is set this done flag from false to true. So the operation here,
the change operation, the edit operation, is assigning a value to a particular field
in this JSON document. Another thing that might happen is somebody might edit a string. So change milk, change
buy milk to buy soy milk. So by inserting a few
letters into that string, another thing that might, people might do is insert a whole new list item of phone mum between buy milk
and before watering water and plants. So this is editing the ordered list object to insert a new item. Another thing that people might do is add another key to this map here or they might even delete an entire entry. So the top level to do item,
just delete the entire list. Why not? Maybe you have several lists
or something like that. So these are the kinds
of changes that people can make to these documents and what we want to do is
have some way of resolving those changes. So that even if people make
these changes concurrently to each other, we end up
with eventual consistency. So we can model this document as a tree and we can have some data
type annotations on it. So say the top level document is a map and inside it we have a
list under the key to do and so on. Won't go into too much detail. So this is actually an
algorithm that I developed with a colleague and we
wrote this paper about it a few months ago. So if you're interested in
that, you can find it online. It's called a Conflict-Free
Replicated JSON Datatype. Has anyone here heard of CRDTS before? Okay a few people, cool. So this is an example of a CRDT. If you don't know what
a CRDT is, don't worry. This paper is very theoretical . It looks inside, it
looks kind of like this. So I'm not actually going to run through all of the operational semantics today, so don't worry about that. I'm just going to kind give the intuition about how the algorithm works and just show kind of some
of the curious edge cases that occur there. Some of the stuff that
we have to think about when trying to do this. So the hope is that this algorithm would allow people to concurrently edit JSON documents and merge those edits and end up with a sensible
result at the end. So one example of a document, we saw the to do list example, another one might be
simply a text document. So. So a text document
consists of the file name, which is just a string. Consists of characters or of the body, of the actual text. So each individual characters. Like the smallest unit you can edit and like maybe there
might be additional stuff for formatting, setting fonts and so on, I'm just going to leave all of that out. Another thing that's quite
useful is cursor positions. So if you use Google
Docs, you can see where in the document another
user is editing right now. That's quite handy to see
so that you don't change the same place at the same
time while you're online. So you could imagine
implementing that as a map from each client has a position. A position is like a
location in the document and so that would allow you to keep track of other people's cursors. So I'll just focus on the
body characters for now. So the ordered list of characters is what constitutes the
content of the document. You might wonder, why this is, why this is a list of strings and not a list of characters. So begin footnotes. It's a list of strings because in Unicode, if you actually
represent like one character, the smallest editable unit of the document is not necessarily a
single Unicode code point. Because you have things like combining, combining marks, I think
they're called for accents, and diacritics and various things or for emoji, I think
the skin color annotation is a combining mark and so
you end up with a sequence of several Unicode code
points that constitutes an character from user points of view. If you're interested in this,
look at the Unicode annex number 29. Anyway, end of footnote. This is completely irrelevant. I just wanted to mention that as well. I'll give a little demo of the text editor we implemented because
its just kind of fun to make this a bit more interactive. So what I have here is
something that doesn't work. Sorry. This is always the thing with demos. Maybe it has to be on the wifi. Okay, let's try again. Okay, good. So. So this is a very basic text editor that we implemented using
this data structure, this algorithm that we developed and I've got actually two
windows here, left and right, which are both instances
of this text editor running and I can say hello to
Berlin and it works. So you can see I can type
something on one side and it appears on the other side and these two editors are
actually communicating via a network connection. So they are two separate processes that are otherwise don't share anything. So they could easily be
on two separate computers on other sides of the world, no problem and what I can now do is
can I kill this server. So they use a web socket
server here to communicate and when I kill the server, what I'm simulating is
a network interupption between the two. So now both editors are offline. So this could be because
actually it's a server outage or just because the clients
have lost their internet connection, it doesn't really matter and so I can keep editing offline. So lets say here. So I'll say, Hello everyone at GOTO Berlin and you see it doesn't appear
on the right hand side, because they're offline. So we have these two now and so I'm going to restart the server now and what we want is for
all of those changes to be preserved. So make sure, look at that
everyone at is still there and hope you're all doing
well is still there. So when I restart the server, the editors in the background keep trying to reconnect to the server automatically and keep retrying in the background and eventually they will mange to connect and resychronize and
now you see everyone at has been copied from the left to the right and the hope you're all doing well has been copied from the right to the left and we didn't have any
like three way merge user interface for this. Like it just did this automatically. Now, let's have a look at how this algorithm actually works because that's kind of interesting. This is basically the same as
what Google Docs does, right? So, Google Docs is a bit fussy with its offline support but if you force it
into actually doing it, it essentially does the same kind of merge and I can run a bit of how the algorithm in Google Docs work. So imagine you have a document consisting of the letters H E L and O and you label each letter with the index of what position it is. So 0, 1, 2, 3 and on the left hand side we've got the green editor and on the right hand side,
we've got the purple editor and each editor now makes an edit to this. So the left hand side
inserts a second L character to change it to Hello and the right hand side
inserts and exclamation mark at the end, making it
H-E-L-O-exclamation mark and so now we want to
merge these two changes that have happened concurrently and so in order to do that, we have the server to, this is, in this case, run by Google and these clients send essentially a dif or like an operation
recording what the change, what change has been made by the user and sent that operation over to the server and so the left hand side says insert L at position three. So you see 0,1,2,3,4 so we inserted the L so the O has moved from position three to position four and so we inserted that L there and so this side, we have the
inserting exclamation mark at the position four and so that describes
the change that was made and now server forwards those changes over to the other client and
so the green change, inserting the L gets forward
by the server over here. Insert L at position three, position three is the right position here. So you end up with Hello exclamation mark. Which is what we wanted. However, if you think about what happens in the other direction. So this insert exclamation
mark at position four, if we simply send that through unchanged, what we will get is
Hell exclamation mark O because over here, we
inserted the letter L at position three so the O moved along from three to four. So really we would need to change that to insert exclamation
mark at position five and only then we would
get the right outcome of having Hello exclamation mark and so what has to happen here is that this position four needs to
be changed to position five because there was concurrently an insert at position three. So the server has to keep track of all of these things
going on simultaneously and it has to transform the messages, rewriting this three to four. Some of the transformation
happens on the client as well but this algorithm does actually depend on the server to do some work as well and that works. So this algorithm is called
operational transformation and it's been around for quite awhile. So it was first discussed
in the academic literature back in the 1980s. Although the first paper that presented this algorithm was actually incorrect and they said so in the paper. They said, heres a case in
which our algorithm fails. Can someone help us fix it please? And then several people,
researchers came along and proposed fixes to it, which worked, and there were several different competing algorithms there. The one that most of the, well done operational
transformation based systems are kind of inherit from is called Jupiter from 1995 and that's what like Google Docs and Ethopad is based off and Google Wave which is now Apache Wave. They all use this Jupiter design which is to use a central server that does some of the transformations. Some of the others don't
use a central server but instead keep track of some kind of multidimensonal hypercubes of all of the edits
happening simultaneously. They start requiring
quite a lot of memory, some of those algorithms. Which is why they're not
used as often in practice and so this kind of works
fine for Google's purposes but remember what we wanted is to be able to work within enter and encryption and so in that case we can't have a server transforming our messages because the server would then have to see the content of the messages and we want what we want is, well we want to avoid one central server because that's a single point of failure and we want to avoid that
server seeing the content of our messages. So we want to be able to
just forward the messages and this is where CRDT's come in. So CRDT stands for communative, no sorry, conflict free
replicated data types which is a bit of a mouthful which is people just say CRDTS and essentially this is a
family of data structures where several nodes
can concurrently change the data and they could
automatically merge and so they've got, as
part of the definition of the data structure,
they've got merge functions or functions which allow
you to apply operations in a different order
and still get the same outcome at the end and if we want to model something like a text document like what I had in our editor example, in that case we have an
ordered list of characters and so the data type we want here is an ordered list and several different algorithms were, have been
proposed to that (mumbles), more recent about in the last 10 years these things have come up and the one that I'll describe now and that our text editor has based off is called RGA, replicated growable array. Which came out of a Korean
research group in 2011. So let me show you how this one works, in a nutshell. We start off with the
same example document which is H-E-L-O but now instead of giving each letter just an index, 0,1,2,3 I'm going to give each
a unique identifier. Each letter has a unique identifier which might be just 0a, 1a, 2a, 3a and we now have the same
edits happen on the left side, we insert the letter L and on the right hand side,
we insert the exclamation mark and every time we insert a new letter, we have to make up a new
identifier for that letter and we're going to have a little rule for how we create new identifiers. So we have, they have
to be globally unique and they also have the
certain ordering property and so we're going to
construct them as follows. Each identifier is a number and a letter and for the number, we chose one greater than the highest number we
have so far in the document and so the document so far
contains 0 to three as numbers so the next number we're
going to pick is four and then we call the
left hand one is node A, the right hand one is Node B and so that will be the letter that we use for the identifier and so if we assume that the names of the nodes are unique. So there's no other node
called A, there's only one node called A and the, furthermore we assume that these numbers are generated by taking one plus the highest we have. So the numbers per node
will always be unique. So that means we generate
4a on the left hand side and 4b on the right hand side. So it's the same four
because they're both in the same starting position but they
have two different node Ids so we get two different identifiers and so that works and now we can send
these things via server or it doesn't even have to be a server that goes via P2P network or
anything else you like as well. So I'll just use a server
here for simplicity and instead of inserting
at the particular position, we're now going to say insert L and we're going to say the
idea of the new letter, which is 4a and we're going to say where to insert it by
saying after position 2a and so 2a is the first L and so the new L with 4a get's inserted after 2a and on the other side, we
insert the exclamation mark after 4, after 3a. 3a is the letter O and so inserting the exclamation mark
with ID for b after 3a means put the exclamation mark after the O and so, we can now forward these messages. So insert L with 4a after 2a. 2a here still means the first L, so it's going to put the
second L in the right place and give it ID 4a and on the other side,
we can take this message and apply that. Insert exclamation mark after 3a. 3a is still the letter O even though we inserted
the letter L before it. It hasn't changed the fact that the ID of the letter O is still 3a and so we can put the exclamation mark with 4b after the letter O, after 3a and so we end
up with the same document in both places which is very nice. We can now just have some
kind of network between these two. It doesn't have to transform
the messages in anyway, it just needs to make
sure it keeps retrying and eventually the messages get through. That's all we need here. We can encrypt them
and it's all very nice. Now there's still a problem
with this algorithm, which I wonder if anyone can spot? Yes? Deletion is one thing. Yes I, deletion is actually quite easy. So what we do for deletion
is just to set a flag for every ID and say that if it's deleted then the ID actually remains
there, secretly hidden, but we're just going to say
okay, this O is gone now, so don't show it in the user interface. That's called a tombstone. There's not. Same position, exactly. So what if two people
insert at the same position? So let's have a simpler
example document here. A,b,c, with Ids 1a, 2a, 3a, and now let's say X
and Y get inserted here between A and B. So the left hand inserts X and Y, gives them two new Ids
according to the rules that I specified earlier. So 4a and 5a and and so what that means
here is insert X after 1a, insert Y after 4a and on the other side, on the right hand side,
I'm going to insert P and Q and we'll have the same
rule again for assigning Ids so they'll get 4b and 5b as their Ids and they also appear between A and B and now we need to make sure
that all of these letters end up in some kind of consistent order on both nodes afterwards and so lets take those
operations and apply them to the other side. So insert P with the new Id 4b after 1a. 1a is the first letter
A, so we insert P there and then insert Q with letter, with id 5b after 4b. 4b was the letter Q, which
we've already inserted, so we can put the, with the letter P. 4b with the P. So the Q with 5b goes after the 4b so we put A, P, Q, X, Y, B, C. So now, okay that's fine. So far now, we need to make
sure that on the other side, we're actually going to
end up with the same thing. So we need to make sure the X ends up here after the P Q but before the B because we put the X here, then the two would be inconsistent. So, well how do we do this now? Because the message is
going to be the same. The message is just going to be, insert X with ID 4a after 1a but if we put it straight after 1a, then the X will go here and we have an inconsistency. So I'm gong to introduce a small algorithm which uses these Ids and that is, look at the Id of
the incoming operation here. When you're applying an
operation that came in from a different node,
in this case that's 4a. So 4a is the incoming operation and we first find the position after which we want to insert. So 1a is here. So we start here, after the A and we look at the next,
the ID of the next element in that list. Which is 4b and if that ID here, 4b, is greater than the incoming element, ID 4a, then we're going to skip over it and if it's greater, the next one is greater again, we'll
again skip over that and we keep skipping
until we find an element that is less than the ID
of the incoming operation. So 2a is less than 4a. So in this ordering here,
we look at first the number than the letter. So here 4b is greater than 4a
because B is greater than A. 5b is definitely greater than 4a because five is greater than four but two is less than four so we know now we have to insert between the 5b and the 2a. Whew and then for the final
operation is actually easy. So this insert Y with Id 5a after 4a while 4a is here. So 5a just goes after it, that works, and if you're wondering the skipping here doesn't apply on this side because here the 4b is greater than 4a so we inserted this 4b
here, we didn't skip over 4a because 4b is greater than 4a and so this actually works and you kind of have to actually prove it mathematically to convince yourself this really works in all cases but we have proved it and
so we actually believed it and so hoorah, we now have an algorithm which will allow us to make these changes concurrently without any transformation, without any coordination and just allow us to end up in the same place. So let's say our document is Hey guys and we decided that is
not gender neutral enough, so one person is going to
change it to hey everyone and the other person is going to change it to hey folks and what does our
editor do in this case now? So I described just now how deletion works and so deletion just means
setting a flag on these so if two people are concurrently delete the same letter, they
both delete the G of guys, well that deleting it once is
the same as deleting twice, it doesn't get anymore deleted through being deleted several times. So those letters just get deleted and the insertion stay because insertion doesn't
replace another insertion. So as these two merge, what
you're going to end up with is the two insertion concatinated, so it will be hey everyone folks or maybe it will be hey folks everyone. So the order of those two is arbitrary, that just depends on coincidence of how the networking happens to work out and you know there's no order that is any, one order isn't any better than the other. The problem is just
that what we have here, everyone folks is not an English word and so you kind of end up
with junk in the document. Though this is actually what
Google Docs does as well. So at this point I'm willing to just say, okay we'll tell the users that look here, this doesn't pass the spell checker and it maybe can pop up a warning saying hey several people
edited the document at the same place. Actually Google Docs doesn't
even do this warning. It just leaves humans
to spot it by themselves and it seems to work in practice so I think at this point we just say okay, it's good enough. We're not going to try to do a grammatical analyze of the sentences and
do the merges automatically so that they obey English grammar because that's just not going to work and people have tried that and it simply doesn't work. So we've got this CRDT,
this data structure for all the lists which
several people can change at the same time. Described deletion and how that works. There's still a lot of
open questions there. So the problem with deletion here is we need to actually
remember for a long time afterwards where a particular ID is because we use these Ids
as a position in the list like address like a pointer and so we can't just delete, if something gets deleted
from the document, we can't then just remove
it and forget about it entirely because then some
operation would come in and insert after 500 and 3a and we go 500 and 3a,
sorry that was deleted. I don't know where we put that insertion. So we have to keep those things which are called tombstones. Also haven't talked about how to do undo, like you know, control zed type of undo or how to reorder the
elements in the list. That's kind of interesting or how to make all of these Ids efficient so it doesn't take too much storage. So still a lot of open questions there. I think the basic
algorithm is kind of fine but actually putting it into practice is still going to need a bit of work but I want to talk about some
other stuff as well, actually so we were talking about
these JSON documents earlier and so interesting things happened when several people concurrently
change JSON documents that don't happen just in
the case of a text document. So let's say we've got here, it's a bit of a contrived
example I'm afraid, a map of colors and so this is, imagine this is an actual structure. So it's not like curly
brace qoutation mark CO etc as a text document but it's
actually the JSON syntax tree. Described here and so one
person inserts a new color, the color red, into this map and so you would then end up with a map containing blue and red and another person concurrently decides actually they want to wipe
out the entire contents of the map so it sets it to the empty map and then adds green into the map and so how do we now merge
these concurrent changes that have happened here? What happens, what's the outcome, what is even the right behavior here? What do we expect? So we can think about it systematically and say, well okay blue,
does it contain blue? Well blue was deleted
on the right hand side by setting the whole map to empty so I guess blue should
not be in the final result cause it was deleted and the
left hand side didn't touch it. What about red? Red intially didn't exist but
it was inserted on the right, on the left hand side
and the right hand side didn't touch red so I guess we need red to be in the final result. What about green? Green was inserted on the right hand side and it was not touched
on the left hand side, so yes we want green to
be in the final results. So our final expected outcome
is to contain red and green but not blue. That seem logical? I guess so. So we can do that and
our algorithm does this. What this means here is that we have to keep track of what this
setting the map to empty means because if you just take this and say, on the left hand client, you first add the red in and
then you take that change that came from the right hand side which is setting the map to empty and if you just apply that naively and say you've got an operation that has sent the map to empty, well then you're going to wipe out the red because the red was added concurrently. So we have to somehow remember what the state of the map was at the time when you set it to empty. So we have here, the thing
is known as a causal context or something like that. In databases like react or example, has anyone used react? They have this data types feature, okay. It works very similair
to this that you have this little extra bit of
information that you pass around in a HTP header and that, the purpose of that is to keep track of what is the state, what was
the state of the document at this time when you
saw it so that then later you know what the
changes need to apply to. So let's look at a different example. Here we've got again an to do list example and we've got buying milk
and it is not yet done and so on the left hand
side an edit happens and it just deletes buy milk from the list because maybe it's been done or maybe we no longer
need milk or whatever, it's just removed from the list. On the right hand side,
somebody sets done to true. So taps the button to set
buying milk done to true. What happens if we try to merge
these two different updates? We can apply the same
logic as we have just now with the colors with the nested maps. So with the colors what we said is okay, does it contain each
of the different things depending on who edited what? Okay so the title of buying milk, on the left hand side that was deleted because we deleted the entire to do item. So I guess the title should
not be in the result. On the right hand side,
we didn't touch the title. So that's got to be deleted. What about done false? Well done false was deleted
on the left hand side and over written on the right
hand side with done true. So I guess done false is out. What about done true? Done true did not appear
on the left hand side. On the right hand side,
done true was added so we want done true to
be in the end result. So if we apply exactly
the same reasoning logic what we end up with is a to do item that contains the flag
done true but no title and this is kind of not what you expect because like we've applied
exactly the same logic but what we've ended up with
something is that looks weird and it looks weird because
we kind of have this implicit scheme idea we expected to do item to always have a title and the done field but this merge here has essentially tampered with our schema. So what do we do here? Maybe we need a schema
to explicitly say exactly what fields something must have but in that case what do we do? So if somebody changes, if
one person changes a field within an object that has a schema and another person deletes
that entire object, well do we say the deletion wins? So in that case, any changes to stuff within the deleted item are just going to be lost but we said earlier we
don't want to lose data so how do we resolve this? Well maybe we say, actually the, if somebody concurrently changes this and sets done to true then we actually going to
bring the title back as well. So even though the title
was actually deleted on the left hand side,
we kind of ressurect this whole deleted item and say, okay it's gonna both
the title and one true but then well we forgot about the fact that somebody deleted this item. So it seems like something
has got to give here and we don't really know what it is. We could say that maybe
overwriting something with an empty map should
not have the same semantics as deleting it and re adding it again or maybe it should have
the same semantics. So I'm afraid this is going to end on a slightly depressing note which is, like we simply don't know how to expose these kind of concurrently
editable data structures to application in a way that
is not horrendously computing and so I think that, I
think there's a lot of value in having these kind of data structures that you can just merge automatically and not have to worry about writing manual conflict resolution code but at the same time,
conccurenacy still is hard, even if you abstract away
the concurrent communication and everything, you've
still got the problem that somehow you can sometimes end up in these merge situations
where's there no one right way of doing it and so if
any of you have ideas, I'd love to hear them. Otherwise, we're
continuing to work on this and just try different approaches, see what kind of APIs
make sense to developers. We could go back to the old bad days and say okay, we're
just going to have this, this oh crap scenario, we're just going to let users resolve all of the edits manually. I don't think that's
really friendly to users and so I think that like
the popularity of things like Google Docs and indeed like, Meld or things like that
for conflict resolution get, shows that I actually, people do need a bit of tooling and help in order to, in order to resolve conflict. We could say, okay well
just put everything on a central server and
serialize everything. That's an option too but again, we've got this problem that you require a network communication all the time and if stuff doesn't work offline, which is also a shame. So I think it is worth
working on this problem but it is an open problem. If you're interested in
more details on this, I've put, tweeted the slides already so you can find it there and there are links to all of the papers. Heres the second page
and here is a third page of references. So our paper is one of those and there are also several others about different CRTDs
and different operational transformation functions. So that goes into vast amounts of detail. Here is a book that I've been writing that I've just sent off to the publishers just two days ago. So you an get an early release
of this online already. This is, it's not
specifically about merging, it's very broad, kind of introduction to the architecture of databases and what do databases
do under the hood, also. So if you're interested
in this sort of thing, I'd love if you could check it out. Maybe give me any feedback if you have any thoughts about it. That would be wonderful. Thank you very much all for coming and I hope you have a great
rest of the conference. (applause) Do we have a minute or two for questions? I'm a bit short of time. We do? Okay, does anyone have any questions or comments or anything? We have some on the app. That's alright. So the question is, "Apart
from the operational transformation algorithm, is there any algorithm which
uses weighted operators to decide the last operation? For instance, the delete operation has the lowest impact for that loses. In terms of weight maybe, neural network probabilities
making decision faster?" So you can used weighting except that doesn't fundamentally
resolve the problem. Which is that you can end up then in some kind of scenarios
where you've just got an impasse and you've got two things with the same weighting and
one of them has got to win and so you can then arbitraily decide whether one thing should win over another and so that's what I was getting at with kind of these trade offs here of like, you could specify maybe in a schema maybe some kind of semantic annotation saying we want delete to win or we want update to win. The problem there is that
it's really hard, I think, to communicate that to developers of like what does that actually mean? If you don't have PHD
in distributed systems, will you still be able to understand what this flag in your
schema actually means? So I think just making
stuff comphrensiable in a way that like, somebody can just go and build an app and they don't have to know about all of the internal details of how conflict resolved internally. I think would be very good and so maybe priorities
is one way of doing that but I'm not sure. The other question just there was, "How did I make my slides?" Which is using an iPad,
I draw them by hand using a iPad app called Paper by a company called 53. - [Man] I think you should wrap up. - Okay, thank you very much for coming. (applause)