OSCON 2016: Dissecting Git's Guts - Git Internals - Emily Xie

Captions
So, just a quick intro about me: my name is Emily, and I am a programmer. I used to work at Wayfair, where I was a software engineer on the dev tools team, and there I developed several of our in-house tools that hooked into git. Now I'm currently at the Recurse Center, where I'm pursuing some personal code projects involving algorithmic art and data visualization.

So for this workshop, Dissecting Git's Guts, we'll be doing a deep dive into how git works under the hood. But before we begin, I wanted to get a show of hands: how many of you here use git on a daily basis? Yeah, most if not all of you. I think I saw all hands go up. Awesome.

So I really like teaching this topic, because programmers use this tool all the time, but most have a rather superficial understanding of how it actually works. xkcd makes a pretty funny joke out of this, of how for most people git is complete sorcery, and because of that they're kind of left in the dark once they screw something up, right? So how many of you have had something go terribly wrong in git? Yeah. I believe there's a term for that: it happens. Fortunately, knowing what goes on under the surface empowers you, as it leads to a better intuition and thus a better ability to navigate the system in case something goes horribly awry, or even in everyday use of the program. So that's what I hope you walk away from this talk with: more comfort in using git and a better conceptual understanding of the system. For example, what is at the heart of a branch? What is happening when you do a git checkout? What is a detached HEAD state?

And the way I'll approach this talk is with the assumption that you're familiar with git porcelain commands. This is the term for all of the high-level commands that end users interface with on a daily basis. So you guys should know these; shout out some examples. Yes: clone, pull, what else? Push, add, status, fetch, branch. Yeah, tons, right? So while we're on the topic of terminology, on the flip side you have something
called git plumbing, and we're not talking about toilets here. Plumbing is the term for all of the low-level commands that allow you to manipulate, inspect, or compare the basic structures of git, and we'll use these plumbing commands to sort of poke around and dissect how git is fundamentally structured.

And this is how we're going to do it. First, we're going to walk through the concept of the .git folder, which is where the git magic happens. Then we're going to drill in and look at the objects directory, which is where git stores stuff. We'll later move on to the refs folder to explore how git aliases things. And lastly, we'll take a look at git packfiles, which is how git saves space, and we'll look at how that ties into the system's general structure. Does that sound good? Great.

So the format of this talk is workshop style, which means that I'll lecture for a little bit, and then once we're at a good stopping point I'll give you some time to explore the concepts we've just learned through a quick exercise. While I'm in lecture mode, that is, while I'm talking like right now, I simply ask that you just pay attention to the screen. You can feel free to take notes, but whatever you do, I really don't recommend that you code along with me and copy what's on my screen while I'm talking, since you'll miss out on some very key points, especially since I'll be moving fast and especially since the exercises will go over what I just covered. Each of these exercises comes with a cheat sheet for you to print and use as a guide. This is the colorful PDF file that I had you download before the class began. Did everyone download that? Oh, that's totally fine. If you haven't downloaded it, no fear: when the exercises begin you'll have a chance to, and there is a link right on the slides themselves, so don't worry whatsoever. Cool.

Also, during these exercises and breakout sessions, we have two TAs on hand to help you out: Ed Thompson, wave hi, and Nick Platt, wave hi. So
feel free to call on any one of us. And lastly, I know this tutorial is slated to last three hours. That seems like a super long time for anyone to sit through anything, so I'm aiming for all of us to get done in close to an hour and a half, or around two hours. Sounds good? Great. So hopefully everything makes sense so far. For those of you who are just streaming in, no worries: we've only just gone through the introduction section, so the rest of this tutorial will still make sense. Okay, great.

Oh, whoops, I forgot that slide. So let's walk through the git internals from the ground up, starting with what makes a git repository. Okay, so I'm just going to go ahead and create a folder, and I'm going to go into this folder, and I want to run a number of git commands on it, like git log. I have git installed on my machine, so if I were to run this right now, at this very moment, would this work? No. Yeah, that's correct, the answer's no, but let's go ahead and try it just for demonstration purposes. Great, so I get a fatal error. That's because before you can perform project-specific git operations on a folder, you actually have to initialize git, and the way you do that is through a porcelain command. Does anybody know this command, by the way? git init. Great, awesome. So git init puts in place the scaffolding that git needs to operate on that project.

Great, so now it says that there's a git folder that's been created, but if you run an ls on it you don't actually see it. So let's do an ls -la, which shows the hidden folders and files in the directory: l standing for list format, to sort of make it pretty, and a to show all the files. And there you go, now you see it. So the .git folder is what makes a directory a git repository. It's where everything that git stores for a given project lives, and because of this, if you ever wanted to back up or duplicate a git project, you can simply copy over this hidden folder and you have all of the
history intact. In fact, when you run a git clone for a repo from a remote source like GitHub or GitLab, that is essentially all that you are doing: copying over this .git folder. Okay, so now it makes sense that the .git folder should be hidden to begin with, right? If this folder contains all of the vitals, you typically don't want people tinkering around with it. But we know what we're doing, so let's go ahead and run an ls to see what git's guts are made of. Not that ls would harm a folder or anything.

Okay, great. So the parts that we're concerned about for this workshop are the objects folder, the refs folder, as well as the HEAD file, and finally an index file that has yet to materialize. These four things constitute the heart of git's structure, and the rest of it is either personalized configurations or user-defined scripts that are beyond the scope of this talk. Who has gone into the .git folder, by the way? Awesome, a roomful of intrepid explorers. Great, so let's all go exploring together and go spelunking into this folder.

We'll start by diving into the objects directory. You might wonder: where does git keep all of the different versions of my files, all of my content? And the answer is right here in the objects folder. This is just a folder which functions as git's database, and the term that you can use to describe how this database works is content-addressable filesystem, which just means a method of storing information so that you can retrieve it based on its content. Wait a minute, what does that mean? This means that when you put something into git's database, it spits back out a key that you can then use to retrieve this content at any given point.

So let's do this: let's put an object into git's database. I'm going to go and write a file, hello world. I'm going to write hello world in it, because why not? I'm going to save this file, and now we're going to use a plumbing command called git hash-object that allows us to copy files into our
objects database. The w flag here indicates to git that you want to write it, and we pass in the name of the file that we want to store. Okay, great. So now we get back this hash, this weird-looking string of gibberish comprised of 40 hex characters, and in your time using git I'm pretty sure you've run into these. How many of you have seen these? Yeah. So where have you seen these? Log, yeah. Someone else? In a rebase, yes. Pardon? Commit IDs. Yeah, well, we'll get to that in a bit; they're not always commit IDs, which is what we're going to learn. Okay, so git log is probably the example that most of you are thinking of.

So this hash is generated by the SHA-1 algorithm, which is built into that hash-object plumbing command we just used, and the hash it produces, for the most part, is uniquely generated based on the contents of the file. Okay, so to drive home my point, I'm just going to run the raw SHA-1 function on this content on the command line to see for ourselves. So here we go: I prepend some metadata, the file type and size, here to match the fact that the hash-object command automatically does it for you right before running the SHA-1 function, and I pipe that into the SHA-1 function. And there we go. For any given set of content you reliably generate the same hash key, and you can really do this any number of times and you will always get back the same hash for this content. But if you alter the content in any form whatsoever (I'm just going to go into this and add an s to the end of hello world; maybe we are living in a multi-world universe, I don't know), you can see how one little letter makes this hash entirely different.

So this idea of getting a hash back that uniquely matches a set of content, kind of in the way that a fingerprint is a unique identifier for a person, is super important, as it's at the heart of what makes git a content-addressable filesystem. So then, let's use this hash to retrieve the content we just stored in the objects folder
which, by the way, the objects folder looks kind of like this. It's going to show you the object and how it's stored. You see that it's organized such that the first two hash characters create a subdirectory under the objects folder, and the remaining 38 function as the file name. We can retrieve the contents by using the plumbing command git cat-file, which allows you to inspect any git object. The p flag here stands for pretty, as in a readable format, and we pass in the hash of the corresponding object, and there are the contents of the file.

But actually, if you try to read this like any other text file (I'm going to try to cat this), what you end up seeing is complete gibberish, right? So the files that git saves into its database aren't just stored as raw copies of what you have. That would make no sense, right, if you think about the fact that git has to operate on potentially massive codebases. Think, for example, of my previous company, which had repos of hundreds of thousands of lines of code and thousands of files; it really would not be scalable. Rather, the contents are compressed into these much smaller objects, and zlib is the compression library that they use.

So now, to further explore how git saves files into our directories, let's apply what you've just learned through an exercise. We're going to explore basic versioning, and the point of this exercise is to examine how git saves the different versions of your files into the git database. And here it is. The first part of the exercise, items number 1 through 4, we just did, so you'll be recreating what I just did. 5 through 7 is new territory. I strongly encourage you to pair: for one, so you can meet people, which is always fun, but for another, to discuss more in depth what you've learned and sort of hash out the concepts. And as I said, if you didn't already download the cheat sheet beforehand, you can just go to my GitHub account, to the gits guts commands repository, and open up the PDF from there.

So I'm
going to pick up where we left off. As I said, I already went through items 1 through 4, so we'll start on item number 5, which is making version 2 of the same file. Okay, so I'm going to clear the screen really quickly. I'm going to open up this hello world file that I've made and stick an additional line into it, like you guys did; you stuck an additional line into your file. I'm going to save this, and I'm going to run the hash-object command to save it to my git database, passing the name of the same file here. And we get a different hash back, right? Because the content is modified, a different hash must accompany it. So now if we show the contents of our objects folder (I'm just going to use a shortcut here for that find command we've been using), we see that there are now two git objects stored in the git database. And if we open up this object that we just created, what you see might surprise you: it's the newest version of our file, in its entirety.

So when you talked to your neighbor, according to the instructions, and discussed what you thought you would see when you opened up this object, how many of you guessed correctly? Awesome, we got quite a few correct deductions. I taught a workshop version of this talk at my previous company, and in that time I found that developers consistently tended to think that version 2 of a file in the objects database would be a diff off of version 1. Who guessed that it would have been a diff? Yeah. Who didn't know what to think? Yeah, totally fine as well. So I think that some other version control systems, like Subversion, store diffs, but git is a different animal: for each version, it stores an entire copy of your file, initially. There is, however, a follow-up to this statement, but we'll get to it a little bit later on in the talk.

We're done with the exercise, by the way, and we're moving on. So another thing to note is
that git uses the hash to detect when a file has changed, and will thus be more selective about when to store a new object. So if you try to store a file that is exactly the same, line for line, character for character, into your objects directory, it will detect that it already exists and it won't duplicate it, but it will spit back at you the same hash, as you see here. And this is true, by the way, even if you saved a dozen copies of the same file under different names into your database: you will get back the same hash, and it won't save it again. That's really part of git's beauty, part of the space-efficient way that it is designed.

Okay, so this type of object that we've been talking about and inspecting has a very specific name, and I very briefly mentioned it before, but for the sake of demonstration we can use the plumbing command cat-file to inspect the object, and this time we're going to pass in the t flag to indicate type. Okay, so someone read that out to me. It's a blob. Yeah, I just wanted to hear you guys say that. So blob: that is the name of these objects that we've been creating, and they are super important because they are the primary objects stored in git, containing all of your file content.

So how do we know what file name this blob goes with? Or how does git represent saving multiple copies of a file under different names? More importantly, what if we want to group a bunch of files together to create a snapshot? Because that's what git is, right? It's not just a snapshot of one file but of your entire folder at any given point. So there's another type of object that gives us this layer of information we're looking for. We've looked at blobs, and what we're going to look at now is called tree objects. Okay, so whereas those blobs we saw correspond to your file contents, well,
you can think of trees as the complete snapshot of your project directory. So we're going to make a tree to demonstrate, but first we need to make an index file, because that's what git makes trees from. So what is an index file? Well, you actually know what it is, because there's a user-friendly, metaphorical term for it that you see all the time. Does anybody know what that term might be? It reminds you of the theater; I'm on stage. That's right, yeah: the staging area and the index are one and the same.

So to move forth, let's stage some files, aka put stuff into our index, by using the plumbing command update-index. The add flag here specifies that we're adding to the index, and we pass in the name of the file that we want to stage. And now I'm going to go ahead and add another file into the index, just for demonstration purposes: fubar, lots of gibberish. So now I'm going to update-index, and this time I actually don't have to manually call hash-object beforehand, because update-index has that functionality built into it. If the blob isn't already there in the objects database, it will be automatically added under the hood. And now if we look at the .git folder, we see that we have an index, which was absent before.

Okay, and if we inspect the contents of the index file with this plumbing command, which you will use, git ls-files, you can see that it's just a running list of all the stuff in our staging area. So this is all staging really is under the hood. And if you run a porcelain git status, you will see that we have indeed added stuff into our staging area. git status: there it is, right? So by now we've saved our files into the git database and updated our staging area, and you might have guessed which porcelain command we've basically done, except with low-level plumbing. What is it? git add, right. So this is all the low-level stuff that happens under the hood in a git add. That's pretty cool, right? Yeah.

So now that we have an index, aka staging, for the tree object to
base itself off of, let's go ahead and finally write that tree. git write-tree: that's the command which you will use. And as you would expect of all objects that you write to your objects folder, you also get back a hash. Let's examine this: git cat-file, p for pretty, passing the hash, and there you go. So this is what our tree object looks like, and you'll notice that it contains the file mode, the object type (blob), a reference to the SHA hashes of the blobs, along with the file name. One thing to note that isn't being shown here is that, in addition to pointing to our blobs, trees can also reference other trees, so as to represent this notion of a subdirectory.

And if you are paying close attention, you might notice that this tree object in fact looks almost exactly like the index file we just looked at, and that's absolutely the case; it's very similar. So what's the difference? Well, unlike the index file, which is meant to be in a constant state of flux, because it's your staging area and it gets changed around constantly, this tree object is a finalized snapshot, captured and persisted into your git database. Does that make sense? Okay. And you can see that this is the case, given that we now have another item in our objects database.

Okay, great. So now I'm going to give you 20 minutes to do exactly what I just did, so that you can digest everything and do further explorations yourself. If you finish early, there's an additional challenge built in, and once again I heavily encourage you to turn to your neighbor and pair.

So now it's official: we have a snapshot of our current working directory stored in our git objects folder, and that is, as you saw, the tree object. But it still seems like we're missing some metadata. We don't have any information about who saved these snapshots, what time they were saved, or why they were saved. So enter this concept of the commit object. Okay, so let's create a commit object. We'll use the plumbing command commit-tree, so this is going to
echo the commit message and pipe it into this commit-tree command. We're going to pass in the SHA hash of the tree object, and like with all objects that we've dealt with so far, we get a SHA hash back, and git predictably sticks it into the objects database, as you can see. Great, so now we have more stuff in our objects database, and if we take a look at this commit object, we'll see that it looks like our typical commit. So this is pretty familiar, right? This is familiar to all? Yeah. So, very importantly, we have the hash of the tree object that this points to; we have the author, which is me; we have the committer (oh, that's also me); and then we have timestamps and finally the all-important commit message. I really botched this commit message. I hope none of your commits look as vague in general as this.

Okay, so as of now, by writing a tree and creating a commit object off of it, what high-level git porcelain command have we now effectively recreated using low-level commands? git commit, that's right. So we've now done the low-level workings of a git commit. That's pretty awesome, right? Yeah.

So let's put another commit on top of this one, because I want to demonstrate how git relates commits to one another. Let's say that I've edited the file fubar for this commit: add another line, more gibberish, the sky is blue. Okay, not so gibberish, but anyway, I save this file. I'm going to update my index, and now I'm going to write the tree. I'm going to echo my second commit message (I also botched this) and pipe this into my git commit-tree plumbing command. Here I'm just going to pass in the SHA hash of this tree, and here I'm going to chain the commits: the p flag here stands for parent, telling it to link this commit to a parent commit, the one that comes before it, so as to indicate a sense of hierarchy. And we get this hash back, which is what we expect. So let's take a look at this commit object that we've just created. It looks pretty similar to the previous commit that we made, right? However,
there is one thing that's different. Does anyone notice what's different here? Parent, that's right. So it says parent, along with the hash of that previous commit. And now if you run a git log on the specific commit hash (you pass dash dash stat because you're doing it on a commit), you'll see how, by simply chaining these commits, we're actually starting to build up a commit history. That's awesome, right?

Okay, so now it's your turn to create a commit object, and I will give you all 15 to 20 minutes. Again, if you finish early, there is a challenge discussion, so I'm actually forcing you to chat with your neighbor this time. I know I keep saying collaboration's great, but now I'm making you do it.

Okay, so who got to the challenge question? Raise your hand. Okay, don't worry about this next quick bit if not. So raise your hand if you think that a new commit object will be created. Now raise your hand if you think a pre-existing commit will simply be altered. Raise your hand if you have no idea what to think. Yeah, okay. So when you perform a git commit amend (who, by the way, does that a lot? I do, like, constantly; it's hilarious), you end up creating a new commit object. Because remember, it always needs a file name associated with a SHA hash, which is generated uniquely based off of the contents, right? It's its way of verifying data integrity. So once you change the commit message for the commit object, the contents of that commit object are now different, so the SHA hash ends up completely different, and you can't just reuse the SHA hash from the one before, right? Okay, anyway, there you go.

Oh, and by the way, this prior commit object with the old message will actually still exist in the objects database, but it will just kind of dangle around, completely useless, and it eventually gets removed. Yes? Yes, absolutely, you could. Absolutely. Yes, that is absolutely correct, that is
absolutely correct. Yes, exactly. Yeah, whenever stuff like this happens, reflog is your go-to. Yeah, me too, actually. Awesome. So anyway, those are really good points that were brought up: you can look at the reflog, and also it's a useless object that eventually gets removed during a garbage collect, which we will go into in further detail in a little bit. Okay, makes sense to everybody? Great.

So anyway, there you have it: you've now performed a low-level git add and a low-level git commit, and if you were following along, you probably noticed the very tree-like way that git is structured. I'm going to go ahead and do a high-level overview of git's structure as you've seen it, so that you can sort of weave all of this together. Okay, so at the very lowest level we have the contents of our files, which is the blob, and each revision, as you saw, was a different blob. Then you have another layer built on top of that in order to associate the files in a snapshot, which is the tree object. And then on top of that we have another layer of metadata, the commit objects, which are chained together for the purpose of forming a history.

Everything is chained in one direction by the usage of hashes to point from one node to the next. The children always know their parents, and never the other way around. And if you drop in anywhere on this chart and follow the pointers, you will never end up back where you started. Thus it's a DAG, a directed acyclic graph, which is a directed graph that contains no cycles. Who's heard of this term, by the way? Awesome, computer scientists in the room. And these hashes are generated based on the file contents, which in turn, as you saw, contain the hashes of any preceding nodes. This creates a chain of dependencies in which the hash of each subsequent object depends on the one before it. In this way, git is also structured as a Merkle tree. Who's heard of the term Merkle tree before? Fewer
people; it's more obscure. But actually, it turns out that this is the same structure that Bitcoin bases itself off of. Pretty cool. And as it's both a DAG and a Merkle tree, some like to call it by the hybrid name of Merkle DAG.

Okay, so regardless of what you decide to call it, you can see what advantages this type of structure might hold. For one, because of the hashing, in that the key is uniquely generated based on its contents, you can verify that the data you put in will always be the data that you get back out, which is a way to, as I said, maintain data integrity. If there's any corruption, git will absolutely notice it. And at the same time, it allows for deduplication of any common children, which I mentioned earlier at the blob level, right? Save the same file under a different file name and you're going to get just one object back. For another, it makes for a highly flexible, super fast piece of software, given that we can content-address any of the nodes in the data structure, because you have all the content and it's just a matter of pointing to it all via the hashes.

Which brings us to our next point, which is references. So now we know that git stores our information in three primary object types. What are they? At the very lowest level, what do we have? Blobs. Then what do we have? Trees. And on top of that? Commits. Awesome. And it's all just kind of floating around in the objects directory. Yes, there it is. But when we work with git, how do we keep track of what commits we want to work off of? Well, usually what are we working in? Branches, right? So there's our answer: branches. But what exactly is a branch? How does git know what objects go with a given branch? It's actually extremely simple: branches in git are merely aliases, or pointers, to commit objects. And these pointers reside in the folder .git/refs/heads, refs meaning references and heads meaning the top-level commits for a given alias. This directory should contain a running list of all
the branches that are in this repository, but as you see, it's completely blank, right? That's because we don't have any branches yet. Well, let's change that. We can use the plumbing command git update-ref to do so, and we pass in the name of the branch that we want to make, master, and the commit object that we want it to point to, which was the last commit we just made. And now if you list your .git/refs/heads folder, you see that you have a master branch inside of it. Okay, and if you open this guy up, you'll see that all it really is is just a text file (because I could cat it) that just contains the hash of the commit object that this branch points to. So in effect, what we've done is this. Oh, okay, yes. Oh goodness, okay, there we go. So we've created a reference to the commit. That's all there really is to a branch, really.

Now, oftentimes when you are working in git you are branching off of master, so actually let's try that. Let's branch off of master and see what happens: git checkout -b, to create a new branch and check out to it, feature. I hope your branches are a little bit more specific sounding than feature. Okay, so it's all done running. So where am I? Okay, yep. So you see that now we have another branch in our heads folder, right? And if you open up this branch to see the hash, what do you think you'll see, by the way? The same, yes. You'll see that the hash is exactly the same as the one for master. So we've effectively created two references pointing to the same commit, and visually it looks something like this: here's your master branch, and you check out to a new branch. That's what it looks like.

Okay, so that's what happens when you freshly branch off of master. But let's say that I've edited some files and then added a commit on top, as this is normally what we're doing when we're working on a new feature branch, right? I'm going to edit the hello world file and add some more lines to it: the sky is blue, it's the truth
actually, is it blue? Well, whatever. I'm going to do some porcelain git add, just to speed things up here, and I'm going to do a git commit. Terrible commit message once again. Okay, and now if you open up the branch file, what happens is that the branch now points to a new commit, right? And if you take a look at the git log, you'll see that the top hash corresponds to the latest commit, and now it's all chained to all the preceding commits. So visually it looks something like this: we changed a file, and then we staged it, and then we committed it, which creates a commit object and links it to the prior commit, and at the same time it moves the branch reference so that it now points at this new commit object.

And if you wanted to check out to master and get your master revision back, what would happen is that git would read the master branch file, which contains the commit hash, and from there it follows the chain of hashes through your trees until it gets to the relevant blob objects, and from there it unpacks those blobs into your working directory.

Okay, so you're probably wondering now: how does git know what branch we're currently working on, so that when we do a git commit like we just did, it knows what branch to move to that commit? And the answer to that (who knows it, by the way? Who can guess? Okay, I'll tell you) is the HEAD file, which resides at the top level of your .git folder. This is just a text file that points to the path of your branch. So if you cat it, there it is, right? It just points to your branch. So git log and git branch, along with a bunch of other commands that you run when you want info on your current branch, read off of this file. I think some of you earlier on were trying to do a git log without the dash dash stat and passing in the commit object. You can't actually run a git log unless you have your HEAD file pointing to a branch file; that's why it didn't work beforehand. Okay, so anyway, when you
bring the HEAD file into the picture, the diagram kind of looks like this. It kind of ran out of space there. So when you do a checkout to another branch, under the hood Git edits the HEAD file so that it then points to the new branch that you're checking out to. So if I were to check back out to master, what would happen? Yeah, the HEAD file would move to master.

And by the way, you've probably also seen the term "detached HEAD state" floating around. Who's seen that term? Isn't it the funniest term ever? It's very memorable. But what does it mean? If you're wondering what that is, it's when you do a git checkout to a commit itself rather than to a branch, so that HEAD points directly at a commit hash. That's a detached HEAD state. Does that make sense?

So the overarching point that I really want to make here is that branches are not some fleshed-out entity. I used to think that they were, right? Rather, they are literally just a text file with a commit hash in it, a pointer to a commit. And we do this because we need a human-readable and meaningful way to reference the commit object that we want to work off of, because those forty-character hex hashes are seriously hard to remember, right? You can't remember that, but you'll certainly remember something like master and feature. Actually, hopefully a better branch name than feature.

OK, so we now have another exercise where you can go ahead and explore doing what I just did. I feel like the challenge will have many possible results, so there's no one right answer to this. Did anybody go through the challenge and want to talk about what they discovered, if anyone found anything interesting? Any brave volunteers? I think I see a hand. OK, awesome. Stand up if you want, or just talk loudly, or I can come and awkwardly share my mic with you. Mm-hmm, interesting. And did your working directory get updated to reflect what was on the new commit that you changed to? Mm-hmm. OK, awesome. Anyone else want to share some weird things or cool things or interesting things that
they saw for the challenge? Cool, awesome. Thank you for sharing. What's your name? Nathan. Thanks for sharing, Nathan.

OK, so one last topic, and that is Git packfiles. I mentioned earlier that Git saves a copy of each and every version of your file, right? So if that's the case, you might start thinking to yourself: geez, that would become pretty clunky pretty fast, right? You're probably all thinking that. So let me just show you. I'm going to do a find to look at my objects database. Look at this: this is the smallest repo ever, but it already has this many objects. How does this scale?

What I've been showing you so far is loose objects. The thing is that Git sometimes automatically packs those loose objects up into a binary file called a packfile, in order to save space and gain efficiency, by using deltas, or diffs. I think I said at the beginning that Git doesn't use diffs, but there's a follow-up to that, and this is my follow-up. We can actually manually reenact this process through the git gc command. And the gc here, what does that stand for? Who knows? Garbage collect, that's right. With garbage collect, Git performs a rather complicated algorithm to determine which of these objects are similar, then picks a base from which to make the deltas, which are the differences between the objects, and stores those instead, which saves quite a bit of room.

So before I run this command to start packing things up, let's run a git count-objects to see what we have. The -H flag here stands for human-readable, as in: make it pretty, make it readable. And we see that we have a count of 11 loose objects for a total size of 44 KB. Now if we run git gc, you see that it counted objects and then did delta compression. And now if we count the objects, you'll see that we have way fewer loose objects hanging around, and the total size has significantly shrunk. Yes? "Counting objects"... oh, maybe this is saying it's counting the... I don't actually know, but maybe it's saying it's counting the objects
that it can diff? Does anyone here know, perchance? Huh, it counts unpacked objects only. Yeah, that's interesting. OK, perfect. Yeah, I think I got rid of the -v because I didn't want to clutter the point, but OK, cool. You guys can run it with -v, and then you can put the capital -H there to see it in human-readable form as well. Thank you for that.

OK, so now we've seen how it significantly shrunk. This is way less space, which is awesome. And if we do a find on our database again, you'll now see something entirely different. The packfile, which is the guy in the last line, is the single file containing the deltas, and the index immediately above it contains offsets into that packfile to quickly reference objects. And if you run a plumbing command, git verify-pack -v, you'll see all the things that have been packed up. There it is: this is all that's been packed up. And that's it for how Git saves space and manages to stay even more space-efficient.

Yes? Yep. OK, I can't actually scroll back up because it's a recording. So you're saying that... oh wait, rats. OK. Oh, HEAD. So you're saying in this slide, right? Was it this one? Yep. Mm-hmm. Oh yeah, that makes a lot of sense. You probably don't want to compress the thing that you're currently on. Very good. OK, awesome.

So actually, that kind of concludes my workshop on how Git works. I hope I've managed to demystify Git for you, but really, the best way to learn about this topic, if you're interested, is to go and dig for yourself. So I wanted to share some of the resources that I've used and always relied on over the years. The first item is Pro Git by Scott Chacon and Ben Straub. It's a really amazing open-source book, and you've probably seen it floating around online when you've googled for Git help. Who's seen this, by the way? Yeah, it's a really great book. So if you learned a lot from this talk, you'll learn even more from reading the source, as it is my
Git bible, and I've structured a lot of the talk around some of its content and drew from its really brilliant explanatory approaches. So I'd like to give a lot of credit to this book. Other perspectives on Git internals are Mary Rose Cook's "Git from the inside out" and John Wiegley's "Git from the Bottom Up." There are many ways of tackling the same question. A particularly great talk I found is by Matthew McCullough (I think that's how you say his name), if you're more of a video learner. And if you wanted to learn even more about Git packfiles and git gc, since I went through that very briefly, you should check out a fantastic article by Aditya Mukerjee, who also did the Recurse Center; it's published in the Recurse Center's Code Words. And lastly, I recently discovered this really gorgeous blog post by Tyler Cipriani that does a very thorough job of visualizing Git's Merkle DAG structure in D3.

Next, I wanted to thank everyone who helped me in preparing this talk by providing an audience for a dry run and offering feedback. And I'd also like to give a round of applause for our TAs, Ed and Nick. Thank you, this was super helpful. And Ed, by the way, will be giving a similar 40-minute talk on Thursday at 4:20 in Ballroom G called "Deep Dive into Git," so I highly suggest that you attend, not only because Ed is awesome, but also because it's really great to hear two perspectives explaining the same topic in order to really come to understand it. And next, I wanted to thank the sound person for being awesome and on point and rolling with the punches several times. A round of applause for the sound person. Hi, what's your name? [inaudible] OK, let's all applaud. Yeah, thank you, thank you so much.

And of course, I want to thank you for being a great class. If you have any questions or feedback, I can take them. Also feel free to tweet at me at any time; I'll respond there. I like Twitter. Hm? Post my slides? So I guess my concern with
the slides is that they're massive, because it's all the recorded videos that I put into the slides directly. So I might not be able to post them. I'm going to try to find a way to, and I'll post a link to it in that repo, the dissecting Git's guts one. I'll make a PDF of it, but I'm not sure; yeah, you might not be able to get the video in. Oh yeah, I could do that, actually. Yeah, that's a good idea. I'm going to do that.

OK, so actually, I did do basically the same talk at a conference called Git Merge, which is like the Git conference. So if you wanted to see essentially the same thing, minus the exercises and minus some silly commentary here and there, you'll be able to find it if you search for Git Merge 2016 videos. I think they'll post them in the next couple of weeks, so you can re-watch the talk there. And I will post them up; it will be in that repo that I sent you to, with the PDF cheat sheet. And maybe I'll also post a link to the talk itself, the Git Merge talk, in that repo, so just check that repo out. Yes? Yeah, there you go. Any other questions? OK, awesome. I hope you all had fun. I had a blast. OK, awesome, great.
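As a recap of the branch and HEAD mechanics described in the talk, here is a minimal Python sketch of the idea that a branch is just a text file holding a commit hash, and that HEAD is just a text file holding the path of the current branch. It is not how Git itself is implemented: the function names (`make_toy_repo`, `resolve_head`, `commit_on_current_branch`, `checkout_branch`) are hypothetical, the commit hashes are placeholders, and it assumes only loose ref files, ignoring packed refs and everything else real Git handles.

```python
import tempfile
from pathlib import Path
from typing import Optional, Tuple

REF_PREFIX = "ref: "
HEADS_PREFIX = "refs/heads/"


def make_toy_repo(root: Path, commit: str) -> Path:
    """Create a toy .git layout: two branch files and a HEAD file.

    Assumption: 'commit' is a placeholder 40-character hash, not a real object.
    """
    git_dir = root / ".git"
    heads = git_dir / "refs" / "heads"
    heads.mkdir(parents=True)
    # Each branch is a text file containing the hash it points to.
    (heads / "master").write_text(commit + "\n")
    (heads / "feature").write_text(commit + "\n")
    # HEAD is a text file containing the path of the current branch.
    (git_dir / "HEAD").write_text(REF_PREFIX + HEADS_PREFIX + "feature\n")
    return git_dir


def resolve_head(git_dir: Path) -> Tuple[Optional[str], str]:
    """Return (branch name or None, commit hash) by reading HEAD."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith(REF_PREFIX):
        ref_path = head[len(REF_PREFIX):]
        commit = (git_dir / ref_path).read_text().strip()
        return ref_path[len(HEADS_PREFIX):], commit
    # Detached HEAD: the file holds a commit hash instead of a branch path.
    return None, head


def commit_on_current_branch(git_dir: Path, new_commit: str) -> None:
    """Mimic the ref update of a commit: move the current branch file."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith(REF_PREFIX):
        (git_dir / head[len(REF_PREFIX):]).write_text(new_commit + "\n")
    else:
        # In a detached state there is no branch to move, only HEAD itself.
        (git_dir / "HEAD").write_text(new_commit + "\n")


def checkout_branch(git_dir: Path, branch: str) -> None:
    """Mimic the ref side of a checkout: rewrite HEAD to the branch path."""
    (git_dir / "HEAD").write_text(REF_PREFIX + HEADS_PREFIX + branch + "\n")


if __name__ == "__main__":
    git_dir = make_toy_repo(Path(tempfile.mkdtemp()), "a" * 40)
    print(resolve_head(git_dir))   # feature and master share one hash
    commit_on_current_branch(git_dir, "b" * 40)
    print(resolve_head(git_dir))   # only the feature file moved
    checkout_branch(git_dir, "master")
    print(resolve_head(git_dir))   # HEAD now names master
```

The sketch covers only the reference bookkeeping. A real checkout also walks the commit's trees and unpacks blobs into the working directory, which is deliberately left out here.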
Info
Channel: Emily Xie
Views: 4,747
Rating: 5 out of 5
Keywords: dissecting git's guts, git, git internals, emily xie, oscon, oscon 2016
Id: YUCwr1Y6bFI
Length: 56min 37sec (3397 seconds)
Published: Thu Sep 15 2016