Introduction to Docker

Captions
Hi everyone, my name is Solomon, I'm the founder and CTO of dotCloud, and I'm going to talk about Docker.

Suspecting that I would be talking to an audience of mostly technical and expert people, I looked for the shortest slide deck I had. What I'm hoping to do is dive into demos, if you're up for demos, and then as soon as possible — since I know at least some of you have been playing with Docker and are thinking about different ways of using it — go into questions, so we can steer this in the direction you want.

First things first: Docker is an open source project. It's a project we released at dotCloud a few months ago. By the way, dotCloud is a platform-as-a-service: we deploy, host, and manage applications for developers who don't want to spend the time and money managing the infrastructure themselves. If you know Heroku, dotCloud is a competitor of Heroku. We open-sourced Docker, which is really at the core of a lot of the technology at dotCloud. It started as a side project and then surprised us by becoming pretty much the most successful thing we've ever done. These numbers are completely out of date, by the way — I think we're now around 160 contributors and just reaching 6,000 GitHub stars, and the project is basically five months old. Imagine, put yourself in my shoes: we put out this side project to see if people were interested, and a few months later 160 people have actually contributed code to it. We've had to organize pretty seriously to deal with that unexpected popularity, and obviously it's a very good problem to have.

So I'm going to talk a little bit about why this is so exciting. By the way, this is the ego slide: a lot of engineers are playing with it and experimenting with it, and although it's not production ready, some people on this slide may or may not be ignoring our advice not to use it in production yet. So it's popular, and the typical question I get is: why is it so popular, what's the big deal? Especially if you know about the underlying technology, you might be wondering, "well, I already knew about containers, so what's changed?"

The high-level story — the reason I think it's popular — is that shipping code to a server is really hard. It should not be hard. As a profession we have become experts at working around that difficulty, but it really shouldn't take that much expertise, especially if you compare it to getting code onto a mobile device. The level of tooling that's available to everyone out of the box still needs a lot of work, let's put it that way.

You all know this, so I'll keep it short, but once upon a time the typical stack looked like this: the server, the framework, the stack, and that was it — the Server with a capital S. Now it looks more like this: you've got a software stack that is much more distributed, complex, and diverse. It's service-oriented, loosely coupled components, different languages, different frameworks that might change over time. And that software stack needs to run on a hardware infrastructure that is itself becoming larger, more complex, and more diverse: your laptop, your colleague's laptop, the test machine, QA, your in-house cluster, the public cloud cluster, cross-cloud deployment.
Increasingly you've also got software companies that offer a SaaS service, and their customers say, "hey, I would like an appliance of that — can you ship it as an appliance? By the way, I run this or that infrastructure setup." So you have a more complex software stack that needs to run on a more complex and diverse hardware infrastructure, and the result is what I call the matrix from hell, or the matrix of doom. If you're in the business of writing or shipping software you're familiar with it: every software component in your stack, multiplied by every place where it might need to run one day, and your job is basically to make sure that every intersection of that matrix somehow works — tests pass the same way at every intersection. If the tests pass on my laptop, then hopefully the tests pass in production. How do you deal with all those possible differences? The typical scenario where it breaks down is: you're developing on your laptop with a certain version of Python, a certain version of libc, a certain distribution, and it works on your machine. Then you ship it to production and it's not Ubuntu, it's Red Hat or CentOS; it's not Python 3, it's Python 2.6, and so on — multiplied by all those rows and columns.

In a nutshell, what we're trying to do with Docker is contribute to solving this problem. One way to think about the problem, and to look for possible solutions, is to find a group of people who have had the same problem in the past and solved it. The analogy we like to use is the shipping industry. If you need to ship stuff somewhere in the world, for centuries the problem was basically: if I ship coffee beans, I need to get them to the other side of the world, and it is my problem how those goods are handled every step of the way. What kind of truck is going to be used? Is the staff trained to handle those kinds of bags in Rotterdam? And so on. The process of shipping my stuff is very tightly coupled to my stuff, which means that every single provider of goods has to have an expert in house, has to map out routes, think about plan B, and keep all that knowledge and infrastructure in-house. It makes for a very brittle, expensive, and unreliable process. And of course you can represent that with the same matrix from hell: every possible good to ship, multiplied by every possible way to ship goods. Same problem — except they solved it. They came up with the shipping container. They all agreed on a box: they agreed on the size, the weight, all the dimensions, how the doors work, where the locks are, where the ID label goes. Infrastructure providers standardized on that, and people shipping goods standardized on it too.

So now, all of a sudden, you get separation of concerns. I'm shipping coffee beans; my problem is simply to load my goods into this box and seal it, and once it's sealed it's no longer my problem. I can hand it to a wide variety of infrastructure providers. I can organize in such a way that later, new infrastructure providers or new infrastructure tools can be added; I don't have to repackage my coffee beans just because we're not going to go through Rotterdam anymore. Conversely, if I'm developing infrastructure — imagining shipping routes, building trucks or boats — I can standardize on that box
and focus on differentiation: faster boats, cheaper, better-organized facilities, and so on. Separation of concerns brought efficiency and automation, and that really changed the world economy, because it became so much cheaper and more reliable to ship things. I could probably point to 15 examples in this room of things that wouldn't exist — things we wouldn't be wearing or using — if the shipping container didn't exist.

So the goal really is to try and do the same thing for software, because personally I think it's embarrassing that on average it takes more time and energy to move a distributed system — a collection of software — from one data center to the next than it takes to ship physical goods from one side of the planet to the other. I think we can do better than that collectively, and that means software. That's what we're trying to do, or contribute to, and the way we're doing it is by trying to define a standard format that is simple enough and sturdy enough that a critical mass of people can agree to use it and integrate with it.

So really the metaphor is: if you're a developer, your software — the bits — are the goods. Docker gives you a standard way to pack them into a box with standard properties, and then you can hand that box to tool makers, ops teams, infrastructure providers. They know how to handle the box, and they will handle it in their own particular way, while you know exactly what's going to happen in the end, because you know how things are organized inside. The developer worries about the inside of the box, infrastructure worries about the outside of the box, and then things are interoperable, repeatable, and ultimately cheaper and more reliable. That's the goal. That's the most high-level thing I could have told you — that's what I just said, and that's the end of my slide deck. I've told you virtually nothing about how we do it or why you should use it, but that's the goal, and I think it's a good idea to start with that. So before I dive into technical details and demos, do you have any questions?

The question is: isn't there such a box for my app already? Don't we actually have too many of them? Is this the case of creating the fifteenth standard to fix the fourteen previous standards? Okay, so you don't read Hacker News, you read xkcd — got it.

The starting point of all this is that we as a team did not set out to create a standard for the sake of creating a standard. As a starting point, to answer the question: as a provider of a hosted service — a platform-as-a-service — our job was to run and deploy web applications, API endpoints, databases. The value proposition of dotCloud was that we can run a lot of things for you, not just the Rails app but a lot of components of the stack. So we were faced with this problem: if we're going to run a lot of different kinds of things for a lot of different developers, let's see what's available that we can tell them, "hey, package it this way and then we'll run it."

VMs are definitely an option; I would say they're on one side of the spectrum — the least application-specific thing, kind of lower in the stack, because you're basically shipping an entire machine. A virtualized machine,
but still a machine. On the other end of the spectrum, you're shipping a static binary, or maybe a JAR if it's a Java application, or a Python package, or maybe a system package — something much more lightweight. You need to send fewer bits because there's more context: if I send you a JAR, it's implicit that you already know I'm going to send you a Java application, so normally I would expect you to have some sort of Java deployment environment already in place, and I'm only shipping you the missing parts. The problem is, the part I'm not shipping has to be there — I have to know for sure it's going to be there — and when I'm sending a JAR, I don't know for sure what's going to be there. A larger problem is that, because of this distributed, complex, diverse software stack, it's increasingly unlikely that the entirety of my stack will be Java, or will be Python. Even if it's Java, I might want to rely on a specific version of Tomcat or a specific version of the JVM, and so on. So how do I, the developer, make sure that it's that version of the JVM that's going to be used on the other side?

If I want to ship everything around the application — all the way down to which version of the application server, which build, which exact configuration, an exact build of the libc, everything — then the only option left to me is a VM. The VM is the only existing format today that lets you ship enough of the information you need to ship as a developer. And the problem with VMs is that they ship too much: you're shipping a whole machine, and that causes multiple problems. One is that the files you're sending are very large. There's also a lot of overhead in deploying them, and that overhead shows up both at small scale and at large scale. If I'm going to do integration tests of a stack of 10 components on my laptop, and I'm using VMs as the unit of delivery, that means I have to deploy 10 VMs — and this laptop will definitely not take 10 VMs. So there's the problem of performance and overhead. At the other end of the spectrum, when you've got massive-scale deployments like you guys have, where a given payload may need the entirety of a machine, the overhead of the VM might not be worth it either — would you rather buy 20% more machines, or get rid of a 20% overhead?

So I'm only partially answering the question, but really the point is that there's a missing middle ground, where the developer ships enough to guarantee "this is the same thing that will run over there that I've tested here," but doesn't ship so much that he's now telling the ops team how to do their job — here's how storage should happen, here's how much RAM this will have, this is the Ethernet interface, NAT is in place. Those are ops decisions. It should be possible to ship the same application bundle and run it once on an SSD array, once on a clustered file system, and once on shitty local storage because you're just testing, without having to rebuild the application three times. Does that make sense?
So that's kind of the problem: the tools that exist today don't solve it in a way that's really satisfying. At the same time, new tools are now possible because Linux just got a hell of a lot better, and it's now capable of cleanly sandboxing the execution of processes. With the namespaces and control groups features — the set of features commonly known as containers — the low-level primitives are in place in the Linux kernel, by default, to execute processes in a way similar to how they're executed on an Android device or an iPhone, for example. It's possible for me to drop a set of files onto a server and execute a process in that directory in a way that is completely sandboxed from other processes. That is really nice, because now I have a really clean primitive for deployment: I can install an app without worrying about how it interferes with other installed apps. I don't have to worry about conflicting dependencies — two different apps requiring two conflicting versions of a given library — because the library will just be installed twice, once in each container.

That development is fairly recent. At dotCloud, for example, we've been using containers for a while, and a lot of people have, but the problem was that you had to build your own custom container-based system, if only because it required patching the kernel. You can't have a critical mass of people using containers to share and reuse software components if most machines out there can't run containers. Now they can, which means we can define a new unit of software delivery that's more lightweight than the VM but can ship more than just the application-specific piece.

Docker is a rewrite of the system we use at dotCloud, so it's not the actual same code. People ask us when we'll say it's ready for production; roughly speaking, we'll tell other people they can use it in production when we're comfortable using it in production ourselves, and the progress towards that is fairly rapid because we know the design — it's very similar to the stuff we run in production today. It is a rewrite, and the reason it's a rewrite is that Docker is much less than dotCloud. dotCloud is a platform: it does everything for the developer using it, from setting up resource allocation strategies across clusters and machines, to load balancing, to all that stuff. Docker does less; it's a more concise tool, and one of the key design principles behind Docker is that it should be possible to use it as an ingredient in your existing platform. For example, one of the conversations going on in the Docker community is: Mesos is really cool, maybe we could use Docker as an ingredient for Mesos deployments. Conversely, we want to use Docker as an ingredient for dotCloud, but — that's just how system design works, I guess — the component inside dotCloud that Docker descends from, Docker's grandpa, is too late to be ripped out; it's just too
dotCloud-specific. Through a rewrite, though, we can have a reusable version.

One of the concrete consequences of making Docker an ingredient is that we've tried to make it as easy as possible to drop into a system — literally, physically, drop into a system. Docker is a static binary that you download onto your server and execute. First you run it as a daemon, and once the daemon is running you can run the client: you just type commands and it passes them to the Docker daemon, and you're in business. Right now this is a VM running on my laptop with the Docker daemon running in the background, so I can start typing Docker commands.

The first thing I'll type is docker ps. Notice that "ps" is reminiscent of ps listing processes: Docker has a process-oriented API, and that is one important distinction from existing container tools. You can use Linux containers without Docker — we did not invent process isolation or Linux containers — but a lot of the tools out there focus on using containers as basically miniature servers: you set up a file system, you boot it, and it's just like a VM but way faster. What we're interested in is using containers as a unit of software delivery — really an envelope running an application — which means that a lot of the commands and API calls are process-oriented. So here I'm basically asking: hey, what processes are running in their respective containers? Nothing is running.

So I can docker run something, and in order to run I need something to run. We have a concept of images, which are basically file system states — basically tarballs — that you can start from, and then you execute processes inside those images. (This is what I meant earlier: larger characters or more lines, choose one; I'm trying to find a good trade-off here. Is this still readable? Okay.) So there's a lot of output here, but basically it's a list of images; on the left side you see names. This is my dev machine, so I have a bunch of stuff lying around. What I have here is an Ubuntu image — if I filter it down, I have a few images called ubuntu, and they have tags to indicate versions, so you can see I basically have a base Ubuntu 12.04 system and a base 12.10 system. Let's use that. I'm telling Docker: hey, run a process inside the ubuntu image, say 12.10, and the process I want you to run is /bin/bash. Then I'm going to add a flag to say I would like to attach to the input of this process, so I can type things in, in addition to having things printed — that's -i for input — and -t to also allocate a TTY so that things look pretty.

If I do that, it creates a new container, and this is a shell in a brand-new container. Docker just created a copy of the base Ubuntu filesystem, set it up, created a Linux container in there, set up networking so I have a network interface and an IP, then executed the process I asked for in there, sandboxed, and attached to it so I can see the inputs and outputs. Notice that all of this happens pretty quickly, so I can do it a bunch of times — every time I do this I am creating a new container and then executing a process in it.
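(For reference, the commands in this part of the demo look roughly like the following — a minimal sketch, assuming the daemon was started with `docker -d`, as was the convention at the time, and that an ubuntu image with 12.04/12.10 tags is present locally.)

```
docker ps        # list running containers (empty at this point)
docker images    # list local images; shows ubuntu with tags like 12.04 and 12.10

docker run -i -t ubuntu:12.10 /bin/bash
# -i attaches stdin, -t allocates a TTY;
# each invocation creates a fresh container from a copy of the image
```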
Yep, sorry — can you run init in it? Right now, what I'm doing here does not run init; it's literally just the process. Actually, here I'm in a container: if I look at all the processes running, this is all there is — only bash, and the ps I just ran. That's containers: my shell is completely isolated, and it has a full filesystem all to itself. The cool thing is it's a copy, so it's not sharing it with anyone else. That's the key primitive: the fork primitive we all know — when you execute a process, you fork the process — except now you can also fork the filesystem that sandboxes the process at the same time.

What that means is I can now do whatever I want on this filesystem. A few demos ago someone said "prove it" and made me remove the whole filesystem — this is always a little traumatizing — so I'm just going to remove /var and /etc. Yeah, okay. (I'll get to the resolv.conf thing in a second, but I did remove /var.) Okay, and I removed /bin, so I'm screwed now. But that's fine, because I can exit, run a new container, and it's back, because it's a new copy. So the first thing you get out of this is a really easy way to throw away a setup and just start a new one, and you know exactly what your starting point is. That's a really nice, solid foundation for a lot of things on top — configuration, installing applications, builds, and so on — because you always have a safe place to go back to. The alternative today is setting up a VM, snapshotting the VM, and reverting to that snapshot, which may or may not be practical. I definitely know I don't want to snapshot my VM every time I want to go back; it's too slow and too cumbersome. I'd rather just do this.

So that's changes to the filesystem. The other thing I want to show you is — what do I want to show you — oh yeah, snapshotting. Let's say I'll just start a new one, and then I'll remove /var again. Every time I give this demo I say I'll remove /var, and it goes well, so I'm just going to do that. Ah, come on, let's be brave here. One day I'm definitely going to remove it on my host machine and I'll be screwed. So let's say that is a change I actually want. Awesome, I removed /var, and I want to reuse this. I'm keeping this on the left, and here I'm just logging into my VM again. Let's list what's going on: if I run docker ps — remember it listed nothing before — it now shows that container running: bash, running, started 46 seconds ago, the starting image is ubuntu 12.10, and this is a unique ID. Everything has a unique ID, always, so you always know "I'm running this and not something else," and you can always track what's going on.

Let's say I want to use this — actually, I want to publish it. First, let's take a look at the changes: it's going to tell me, "so, you've removed a bunch of stuff, you idiot." It also works for additions — if I add fresh files... yeah, okay, you get the idea. Let's say I like this and I want to keep it. At any given time I can say: okay, commit, and I can optionally give it a name — let's call it solomon's broken ubuntu. There's a convention of username/name, but you don't have to follow it. Okay, it committed and gave me a new ID, and now if I look at my images and filter for "broken", there you go: you have the broken-ubuntu image. So now, if I want, I can use it right away: run the same thing, a shell, in broken-ubuntu, and — wait, oh right, I removed too much stuff.
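(Roughly, that inspect-and-commit flow looks like this — a sketch with a hypothetical container ID and image name, not the exact output from the talk.)

```
docker diff 4f2a81c0b1de                    # list filesystem changes made inside the container
docker commit 4f2a81c0b1de solomon/broken-ubuntu
docker images | grep broken                 # the new image now shows up locally
docker run -i -t solomon/broken-ubuntu /bin/bash   # start a fresh container from it
```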
Of course I had to hit that one edge case. If you're interested in the technical details: ideally you could run Docker on basically any tarball at all. There are a few dependencies — for example, the ip binary needs to be there because we use it to set up networking for the container — but there's a pull request to fix that. Anyway, you believe me, right?

So let's say: okay, great, I want to publish that. One of the reasons it's interesting to agree on a really simple, conventional way of doing this is that containers then become portable, so they go from being a cool way to run miniature servers to being a cool way of doing, basically, libraries with more stuff in them. The whole point is to use containers as reusable software components, except instead of binding to them at link time, you execute them and interact with them that way, or over the network. So let's say I want to share this beautiful piece of software I just designed. There's a built-in registry system where you can docker push — let's say I just want to push broken-ubuntu — and, assuming there's internet, it will connect to a public registry where anyone can upload stuff and download stuff from each other. It's just going to upload that container, and then if I go to the Docker index and look for broken-ubuntu... if you go there right now, and assuming you have Docker installed, you can literally copy this command — docker pull shykes/broken-ubuntu — it'll download it for you, and on your local deployment of Docker you'll have the exact same awesome configuration I just created, and you can run it. Well, you can not run it. So that's the basic mechanics of sharing.

One thing you might have noticed is that the upload was pretty fast, even though this is a full copy of an Ubuntu filesystem. The base Ubuntu image is pretty trimmed down, but it's still at least 100 megs uncompressed, I think. So how come it uploaded in five seconds? The answer is that behind the scenes Docker handles versioning: every time you commit, you kind of fork a new container, and Docker tracks the history of that container, just like git commits. So when I did docker push, it did something very analogous to git push: my local Docker and the remote registry figured out which versions were already available on the registry and skipped those, and the only thing left to upload was really the diff that said "remove these three directories" — and that was it. That's why things happen so fast. We use that copy-on-write mechanism for a lot of things: to speed up execution — you get that really nice primitive, "let me just run a new copy," and it's really fast — to save disk space locally, and to save bandwidth when you transfer.
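(The push/pull round trip, sketched; the shykes/broken-ubuntu name just follows the username/name convention mentioned above, and only the layers the registry doesn't already have actually get uploaded.)

```
docker push shykes/broken-ubuntu      # upload: only missing layers are sent

# ...on any other machine with Docker installed:
docker pull shykes/broken-ubuntu      # download the same image, layer by layer
docker run -i -t shykes/broken-ubuntu /bin/bash
```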
You can use that primitive for advanced use cases, too — these primitives really are all you need to compose even the most complex software delivery pipelines. For example, say I have a continuous integration setup where, every time a new git commit is pushed, Jenkins or Bamboo or something similar checks it out, builds it, and executes tests. Now one thing I can do is start from the source code, create a new container, run the commands necessary to install the dependencies, upload the code into the container, and compile. Then I have the resulting application container, I can run tests in it, and if I want to ship that out to a hundred servers, I can do it without actually sending the whole thing a hundred times — I'm only sending the diff. In a typical setup you've got a really fat base — the base system — then all the system packages you need, maybe a Puppet or Chef custom configuration on top of that, then application dependencies on top of that, and then the application on top of that. If you build that multiple times per hour, the only thing you're really going to ship multiple times per hour is that top layer with a new build of the application, which is much smaller. Docker just does that, and you still get the semantics of pushing the whole image: you don't have to think "let me tar this up, then I'll create a symlink and reassemble it on arrival" — Docker just does it.

Do you have any questions? Yeah — so beyond starting a shell and experimenting, which is really fun and gratifying, the Docker user community has pretty quickly moved on to "okay, now let's use this as a building block for the real deal," and then right away you have the question of dependencies between containers. Containers are perfect for deploying network services — something that exposes a TCP port you can connect to. You get this nice unit: the physical representation of a service that is abstract to someone else. So there are two parts: part one is how do I expose TCP ports — a network service — as a container, and part two is how do I consume other network services that I depend on.

For the first part, exposing ports, there's a whole networking part of Docker that's built in, and I'll give you a quick example. Let's exit this. Say I want to run netcat. I think netcat is built in... I guess not, okay, so here's what we'll do: I'm going to start a shell, install netcat, and then run it. The key thing is that I'm going to add a flag, -p, and give a port number like 8080. Now I'm telling Docker: this container is going to expose a TCP port at 8080 — just a heads up, the application is going to listen on it and expects that port to be available for discovery. Docker, out of the box, honors that by setting up iptables rules: it chooses a public port that's available on the host and sets up NAT so that the port is reachable. So the container is running here; I'll install netcat, and if I look at the running processes, you'll see extra information at the end, the ports: it shows that port 49153 on the host is redirecting to port 8080 in the container. By the way, while I'm at it, this is a really cool use of diff: you can see what package managers do on a system, what they put where — I use that to learn how things actually work; they still do a lot of stuff, but that's the short version anyway. So netcat is there, and I can listen on port 8080.
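(A sketch of that port-exposure demo; the host port Docker picks — 49153 here — is arbitrary, and the netcat flags assume the traditional netcat package.)

```
docker run -i -t -p 8080 ubuntu:12.10 /bin/bash
# inside the container:
apt-get install -y netcat
nc -l -p 8080                      # listen on the exposed port

# from the host, in another terminal:
docker ps                          # PORTS column shows something like 0.0.0.0:49153->8080/tcp
docker port <container-id> 8080    # prints the host port mapped to container port 8080
nc localhost 49153                 # connect through the NAT rule to the service
```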
So now, say I'm an external client — this is part one: there's a container, it's exposing a port, and Docker makes it available for discovery. These commands I'm typing to figure out which port to connect on — imagine instead there's some sort of service discovery or orchestration system that does the equivalent of this with an API call and figures out "oh, this really cool netcat service is available at this address." And then normally... okay, I don't know why this doesn't work. Second try. All right — I get a request coming in, and obviously it's not going to do much, but discovery has let me find the TCP port and connect to it. So that's built in. The idea is that if you're the developer packaging the app into a container, that -p flag I typed on the fly can also be set as a default value in the container's metadata — a very simple JSON file — so that when someone pulls the image it already says "hey, I'm going to expose port 8080," and they can just run it and the port will be magically exposed. That's a key piece of information the developer can pass on to the deployment platform by declaring exposed ports.

The second part is: if I'm a container and I need to discover how to connect to some other external dependency, how does that work? That has generated a lot of discussion, because there are a lot of different ways to do service discovery, and there are subgroups within the Docker community using different systems for it, including ZooKeeper. Initially the answer was: containers are just there, they call out to your service discovery of choice, and it's completely outside of Docker's responsibility. The only problem with that is that you need some small indirection so you can follow the Unix philosophy of a component that does one thing, does it well, and can be composed in any way — kind of like pipes: a process doesn't know whether its standard output is going to go to the standard input of another process through a pipe, or into a file; it just knows it's printing out. We need the equivalent of that for linking to another service. So what we're adding in the next version of Docker, 0.7, which is coming out in two or three weeks — it's a monthly release schedule — is a concept of links. You can link one container to another, and once a container is linked into another, that container can discover the containers linked into it: it can inspect properties of that container, including the ports it exposes, and connect to them. For example, you would link the database container into the application container, and then the application container would introspect the existence of that database container and figure out how to connect to it.
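(As the feature later shipped, a link looked roughly like this — the flag spelling and the environment-variable names below come from later Docker releases and may differ from the 0.7 syntax discussed here, and the `postgres` and `myuser/myapp` image names are purely illustrative.)

```
docker run -d --name db postgres                       # a database container
docker run -i -t --link db:db myuser/myapp /bin/bash   # application container, linked to it

# inside the application container, Docker injects variables describing the link:
env | grep DB_
#   DB_PORT_5432_TCP_ADDR=172.17.0.2
#   DB_PORT_5432_TCP_PORT=5432
```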
The key is: how do you do something there that's thin enough that it doesn't conflict with using ZooKeeper or another system? The goal is not to replace those systems, and Docker doesn't do anything on the wire — it doesn't do any sort of distributed communication, it doesn't have a name service; a Docker host never talks to another Docker host. The way we implement this is with a pattern we call the ambassador pattern. Basically, we're saying there is a huge benefit, when you're architecting your stack, in this rule: if on any given machine your application container depends on an external service — for example a database — it should not count on that service being on another machine, or on the local machine; there should be no assumption there. So what we recommend is: for each external resource you depend on, always have a local representation of it as a container. In other words, if on a given machine you're running an application container that depends on a database, you should always have a database container there, locally, on the same machine — and whether that container actually is the database, or is just a placeholder for some remote database service, becomes an implementation detail.

With that primitive — any dependency I have is represented by a container locally — things become much simpler, because the application container can always use a very simple system to find that container and connect to it: look up "where should I connect to," and connect. And now you can swap out that dependency in all sorts of ways; basically you get dependency injection at every level of your stack. For example, you can run integration tests against your app and replace the actual database service with a mock database that just parrots canned responses but records what the application queries, to make sure the application behaves correctly. Now you can run full integration tests on a local machine with no internet connection.

Another good example: a lot of applications use the Twitter API, and they have this huge problem that "my application stack now includes the Twitter API" — and obviously the Twitter API doesn't run on my machines, it runs at Twitter. Sure, there's a mechanism for sandboxing and so on, but when I want to run repeatable, offline integration tests, it's really hard to do dependency injection: I have to do it within each process, and if I have, say, a Node.js component and a Ruby on Rails component, I now have to do everything twice. And if I want to reconfigure my app to talk to a different Twitter account, or things like that, I have to change credentials everywhere. With the ambassador pattern, you just have a container represent the Twitter service, and that container is maintained by the local development team: for integration tests you have a fake Twitter account, for dev and staging you have a staging Twitter account, and for production you have a production Twitter account. Same thing for the database.

So that's how we address the problem of wiring containers together. Without going into the specifics of how discovery happens, it becomes very simple: there's a simple mode where we inject a few environment variables into the container so you can introspect them, and there's a more advanced mode if you want to watch for changes — new containers starting, new dependencies added dynamically, things like that — where each container gets its own tiny Redis database with a live representation of what it can connect to. The idea is that if you're deploying this in a massive Mesos-based cluster, for example, that would actually all be backed by Mesos service discovery.
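(A rough sketch of the ambassador idea, under stated assumptions: the image names `my/socat-ambassador` and `my/app` are hypothetical, `db.internal.example.com` stands in for wherever the real database lives, and socat is just one common way to implement the placeholder container.)

```
# On the database host: the real database.
docker run -d --name db postgres

# On the application host: a local "db" stand-in that simply forwards traffic
# to the remote database; to the application it looks like a local dependency.
docker run -d --name db_ambassador my/socat-ambassador \
    socat TCP-LISTEN:5432,fork TCP:db.internal.example.com:5432

# The application links against the local ambassador exactly as it would against
# a real local database; swapping the backend never touches the application.
docker run -d --link db_ambassador:db my/app
```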
We've explored all the options — we had a prior alpha version that allowed for some merge mechanics — but we realized it wasn't worth it at the end of the day. What kind of blurs things is that there are a lot of git-like operations, so it's tempting to say "well, it's just git, right? If git does it, Docker should do it." But the key thing is that a Docker container is not source — it's really more like a binary — and you don't usually merge binaries, and you don't change them in place when you're updating, either. The typical process is more like: for a given checkout of the source code, I compile a binary. We have that one-to-one mapping, so if there is a new version — whether it comes from a merge or just a change or anything else — we've determined that the right approach, in the Docker world at least, is to just rebuild the container from that checkout. The reason that's a viable option is that, thanks to the copy-on-write stuff and the general implementation details, rebuilding is fast enough and cheap enough that you can actually do it. Technically you could rebuild a VM from every git checkout; almost nobody does — some people do — but it requires a lot of engineering and it's cumbersome, because building VMs takes forever and it's not practical. The only company I know of that really does it, beginning to end, with everyone bought in, is Netflix, and the result is that they've standardized on EC2: if you want to run even a quick integration test on a given version of a branch, you have to actually spin up EC2 instances.

So yeah, there's no merging. The other scenario where merging is tempting is when you say "I've built my application, it's all there, I just need to rebase it against this new version of the distro." But in practice you can't: installing a package on a given existing system has side effects — you don't know for sure what apt-get install or rpm install does; it might just check for what's already there — so we can't assume the result will be the same. In some edge cases it might be useful, but usually it's just better to assume there's a better way.
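(Rebuilding from a checkout is cheap precisely because of the layering described earlier; a hypothetical build definition for such a pipeline might look like this — the image name, package list, and paths are all illustrative.)

```
cat > Dockerfile <<'EOF'
FROM ubuntu:12.10
RUN apt-get update && apt-get install -y python python-pip
ADD requirements.txt /app/requirements.txt
RUN pip install -r /app/requirements.txt
ADD . /app
CMD ["python", "/app/server.py"]
EOF

docker build -t myuser/myapp .    # each step is a cached layer; only changed layers rebuild
docker push myuser/myapp          # and only the new top layers get uploaded
```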
Info
Channel: Twitter University
Views: 911,114
Keywords: docker, container, solomon hykes, dotcloud, twitter, open source, oss, twitter university
Id: Q5POuMHxW-0
Length: 47min 14sec (2834 seconds)
Published: Wed Oct 02 2013