How to make a real-time collaborative text  editor in 5 easy steps! // Rudi Chen

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
let me know if this situation sounds familiar to you you're in some class and you've been given an assignment maybe it's a research paper maybe it's a presentation so you write a draft and you send it to your partners and then you revise the draft sending it to a partner again finish it up send it apart your partner again make some last-minute changes sent it to partner again promise this is going to be the last version senator your partner's again oh wait while you were writing the seriously I promise this final version your partner fixed a couple of typos so now you have two versions you have to merge them you miss some of their typos you lose a few marks everyone is sad but fortunately we have solutions to these sort of problems now among the coolest cloud apps have come out in the recent years are apps like Google Docs and roebucks paper where multiple people can edit the same document at the same time if you've ever used one you probably know how incredibly convenient they are you can brainstorm at the same time can work on different or the same bit so the project at the same time you can fix other people's typos and to support that these applications need a way to resolve conflicts that happened when two users change the same text to different things at the same time these applications also enable having a single source of truth which means that there's only ever one version of the document alive at any given time so you don't have to chase down all your teammates rush in the document and that also requires a way to resolve editing conflicts so the goal of this talk is to get is to give you an idea of how these algorithms for conflict resolution might look like and first let's just start by looking at an example to see what might make this a hard problem to solve let's say you and I are working on the title of this talk it's definitely going to include build a text editor and we might have our cursors at different places because we want to change different things I want to add your day cool text editor and you want to write how to build a text editor if we have a very naive way of synchronizing these changes we might to do something like this I'll send you the change insert cool at column 8 and you can send me insert how to @ : 0 but originally when I've wrote cool column eight-pointed after the letter A but when you receive the message column 8 points after the letter B because you've answered this something before it so when you apply the change you'll get some gibberish like how to cool you'd eat text editor and everyone is sad again so to resolve these kind of editing conflicts I'm going to present a class of algorithms called conflict free replicated data types or CR DTS for short it is not what Google Docs and draw box paper use they use something called optional trans operational transform but that believe that CRE T's are quite a bit easier to understand especially in ten minutes and they will generalize better to things other than text documents so the first step will need to build a CRE T is T to understand the importance of commutative operations so earlier we got an editing conflict because our operations did not commute inserting a column zero then column eight is not the same thing as inserting as column eight then column zero but let's imagine and let's not worry about about how for now let's just imagine that every single operation that the user did was cumulative so the user makes a bunch of changes and I can rearrange them in any order and I'll still get the same result in the end now if this is true for all users can you see how this might make synchronization really easy because two users can be doing completely different things separately send each other their operations at a later time and because the order in which operations get applied doesn't matter each user can be applying their operations and the other person's operations in a different order and you'll get the same result on every client and you get something very consistent a very trivial example of a CR ET that has this property would be an ad only set it doesn't matter in which order you add things to set it's always the same set in the end now technically you don't need operation to be commutative if you only need them to be so sort of interleaver ball but I like to think to use the word commutative because it's easier to think about and the second property we'll need is for operations to be idempotent sounds like a bit big word but it only means that if I apply the same option operation twice it's the same as applying it only once this is useful for example with two users delete the same character at the same time this should only cause one character to be deleted so that's cool but how are we going to design these commutative operations so first let's look at how we could implement a synchronized back data structure a bag is like a set but you can have multiple copies of the same item like a bag of Kiwis for example there's all these qyz are the same but you can have multiple Kiwis and this bag should have two operations add a QE and remove a Kiwi so let's say we start with a bag that has one key way in it and there's three people editing this bag each of their own local copies two of them decide to remove it Kiwi and one of them decides to add a new Kiwi first we apply the operation locally and then send each other the messages of what we just did now what do you think the end result should be probably only one Kiwi in the back right like we had one Kiwi we removed it but we added a new different Kiwi so we have the new Kiwi in the bag in the end but in this situation what will happen is that the bottom Kiwi the bottom bag will have no Kiwis in it because it received two removed Kiwi operations and the top two bags will either have zero or one QE depending on whether the ad message or the remove message arrived first so we got an inconsistent state on different clients they see different things that's not very good but we could change a little bit how our operations at design we could change the add operation to not only add a Kiwi to the bag but give the newly added Kiwi a unique ID and we're going to generate this ID in such a way that no client has ever used this ID before ever and the remove operation can take not just a Kiwi to remove but a particular QE to remove with a particular ID now if I do the same thing as before removing two bags and add in one bag I'll find that after sending out the messages I'll have the second Kiwi and every bag the bottom bag received to remove Kiwi messages but it was but both both requested to remove TV Kiwi number one and Kiwi number two was never touched and different add and remove operations are never gonna conflict because they're operating on completely different elements so that's a neat trick but you didn't really come here to hear me talk about grocery shopping right you want to know about text editing but okay I mean grocery bag isn't too different than a text document I mean if you just replace the Kiwis in the bag with alphabet cereals you got more or less the same thing the only difference is usually when you have a text document it's implied that the characters come in some order so what we're gonna do is we're still gonna use a bag data structure to store the characters in our text and of course you can have multiple of the same character in a text and each of them are gonna have their own unique IDs but the order of the character isn't in the text is going to be implicitly determined by the order of the IDs that is to say the order of the characters in the text is the same the order in which the IDS can you just sort the characters by their ID and you get the text so again that's neat but like in this case how do you insert between two characters like I don't know if if any of you know this but I misspelled the word occurrence it's supposed to take two hours so how do I add the missing are the one R is ID five and the e is ID six so how to insert in between well the trick is pretty simple you create a fraction if you don't have room between five and six you just make an ID with 5.5 so that's pretty simple but we do want to generate unique IDs and in this case that doesn't work because if you if two different people who decide to fix this typo and insert a character are at the same time they'll both generate IDs 5.5 and we want to the IDS to be different so the last step is going to be to combine fractional indexing with unique ID generation let's say we have two users editing the same text a duck and a turkey because it's boring if it's always humans plus there was no gusta mucho yet so a duck it is the duck types are water and that's straightforward you generate ideas one to six but they're also going to tag these IDs with the site ID of the duck and then when the turkey wants to add something like Lu there's again no room between ID five and six that's all right you can just create IDs 5.1 5.2 5.3 5.4 but the new digit of this these IDs that which represent fractions is going to be tagged with the site ID the turkey because the turkey inserted those characters and if the dog decides to troll and add a hunk it's gonna run out of space again so we're gonna create a new ID with a third digit and that's going to be tagged with the ID of the duck again the last edge case we want to solve if to generate unique IDs on character creation is if the same user deletes and reinsert the same character even with the site IDs that could still generate the same fraction so we're just gonna add a timestamp or an incrementing counter on every client and that solves that problem so now whenever we typed in your character we're always going to generate a unique unique ID which is going to consist of a timestamp plus a list of pairs of digits and site IDs the leading character still works as in the bag of Kiwis instead of giving it a character to delete or the position of a character to delete you're going to give it the ID of a character to delete so now every operation always operates on unique elements therefore operation operations are commutative and as we discuss at the very beginning since operations are commutative then synchronization becomes trivial and that's pretty much it I mean there's a bunch of implementation details which I haven't gone into and I'll link to my blog post which goes into more of these details but what I hope this talk was helpful in doing is to giving into an intuition on how you might design operations so that you can have a collaborative editor text or otherwise and we do it by designing operation in such a way that they never conflict at all mixed response type projects could be used to at work and I hope you found this talk useful this is I put my contact information on the science my name is Rudy feel free to check out my blog for a more detailed write-up of this technique and thank you for your time you
Info
Channel: StarCon KW
Views: 11,204
Rating: undefined out of 5
Keywords:
Id: jIR0Ngov7vo
Channel Id: undefined
Length: 13min 2sec (782 seconds)
Published: Thu May 17 2018
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.