Containers and Microservices: New Ways to Deploy and Manage Applications at Scale - Jake Moshenko

Captions
I'm Jake, I'm the head of Quay at CoreOS, and I'm here to talk to you about containers and microservices and some of the new ways you can actually manage these things. My goal is that you leave here with some kind of non-trivial realization. A lot of times we go to talks and just listen to the same stuff over and over again; I'm sorry if this ends up being that, but hopefully everyone here will leave with a new realization or some new insight.

So this is me. I worked for about seven years at the big companies you see listed there: Boeing, where I worked on flight simulator software; Amazon, where I worked on e-commerce platform stuff; and then Google, where I worked on APIs. After that my buddy and I left Google to found a startup, and we spent about two years on that. Interesting tidbit: we actually used Docker in production since version 0.3. Those were the scary days, when every new release of Docker brought something completely new. We built Quay out of necessity. Quay is the product I continue to run at CoreOS: it's a container image registry, the place where you store your immutable, deployable artifacts before you send them out to your servers. As I alluded to, we were acquired by CoreOS and I continue to work on the same software, so all in all I have about eleven years of experience with these kinds of distributed systems.

If you don't know what CoreOS does, our tagline is that we're running the world's containers. You may have also heard that we're securing the backend of the Internet; both of those things are true. We're a deeply ingrained open source company: we have over 90 projects with over a thousand different contributors, and those aren't all CoreOS employees. We also have some enterprise solutions where we bundle these things together, or provide a service that enables this way of thinking; those are the Tectonic and Quay products, which I'll get to in a second.

How many of you here have heard the term GIFEE? It's no surprise what it stands for: Google's Infrastructure For Everyone Else. It's kind of a charged term, and there's a little bit of FUD in it: if you're not doing what Google is doing, are you doing it well enough? When I sat down and thought about it, it turns out that what GIFEE actually is isn't really defined anywhere, so I wrote down what I think the core tenets of Google's infrastructure are, in a way that's shareable and understandable by other people. These are the things I came up with.

First of all, we have cattle, not pets. That's probably pretty familiar to everybody by now, but the basic idea is that you don't put any special emphasis on one machine. Under GIFEE we extend this to any instance of a service, any instance of a database: we want anything to be able to disappear immediately, whether due to failure, a network partition, or whatever, and have our infrastructure continue to run. We also want declarative deployment, which means you tell the infrastructure what you want to run, not how to run it. We want automated scheduling: at Google scale it's simply impossible for a human to manually push individual services to individual instances. And we need service discovery: if we're automatically scheduling these things, we don't really know where they're running and we don't really know how to contact them.
We also need some shared services on our cluster, and I'll come back to this a little later because it's an interesting notion: the idea that we can have services such as log aggregation that live on the cluster, and that all of our services can depend on finding at a known address. We also need an incredible network to run all this stuff on; we can't just decompose our apps into microservices and containers and expect to run them on the same network that was running monolithic apps, which will become more apparent later. And finally, Google has these incredible storage primitives that you can use to build your apps: things like the Colossus file system or the Spanner immediately consistent, distributed database.

OK, so I'm going to talk a little bit about how your journey to GIFEE works, the evolution of bringing GIFEE to your infrastructure. First we start with a lowly web server. It's just a simple nginx web server fronting an app built on Rails. A user makes a request, it goes through the web server, it goes through our app code. Let's say this is an identity provider service, so the only thing we're doing is storing and retrieving users; that's all built on top of Ruby on Rails, and it talks to a database. Simple enough. We also run an API server in the same stack, because it's convenient, and when we run things together out of convenience, that's what I mean by a monolithic architecture.

The lowly monolithic web node is actually a really great way to architect your software when you're first bootstrapping. If you've ever heard Paul Graham say "do things that don't scale," this is probably one of them, but it has a lot of features that software developers love. It's easy to write these things, and it's quick to deploy them: you just use the deployment mechanism for whatever your platform of choice is. You can fit the whole model completely in your head; it's not unreasonable that one developer might understand every single line of code contributing to this application. And finally, if you just throw it behind a load balancer, you can scale it relatively well. When you start architecting your app in this monolithic fashion, you do the hello world for Rails and you say, "wow, this software stuff is easy, I don't know what's wrong with all these other software architects, why is it supposed to be so hard?"

Like I mentioned, when we start to get popular we can take that same exact monolithic app server and just throw it behind a load balancer. In this case we have a bunch of app servers, they're all talking to one database, and they're all behind a load balancer that fronts all of the traffic. But things get a little more interesting as we keep growing. What if our API traffic vastly outnumbers our actual user traffic, our rendering of web pages? We're still deploying that web page rendering software to all of our servers, and it's taking up resources: it's taking up memory, it may have background processes, it might even be taking up CPU, and that's unnecessary. We've deployed several copies of software to our servers that we're not actually using. These are wasted resources, something we'd want to recapture.
This is one of the first things people realize when they start talking about microservices: you can actually scale these things independently. Microservices provide other benefits too. You can logically isolate the different portions of your app from one another, which gives you a choice of technology: each portion of the app can be built in whatever language and on whatever server you want. And you can get better machine utilization, which is actually somewhat different from independent scaling; I'll talk about that in just a second.

To see what this looks like, the first thing we do is decompose our monolithic app into three services. We have a web service, which fronts web page requests from user traffic; we have an API service, which talks to those clients that recently got popular; and we have the users service, which is basically just loading and storing users for everything else, and which is the only thing that talks to the database. That can be really important because it gives us logical isolation. Let's say we decide that a relational database just isn't working out for us for some reason, and the developers want to switch to a different store; here I have the logo for etcd, which is something we build. Because we've decomposed the app into separate services, we get logical isolation: the impact of that change of data stores stops at the individual service, so we can swap out the data store and none of the callers are any the wiser.

It also lets us pick a variety of technologies. Somebody might say, "I could really optimize the API server if I built it in Go, because Go is fantastic for that kind of stuff" (this is the developer talking), and then the web developer says, "I'm sick of Rails, Rails is terrible, I'm switching to Django." All of that is feasible because we've decomposed our technology into these individual services. This should be pretty much review at this point.

As I mentioned earlier, the services are all independently scalable. As the API service heats up, say somebody took our identity provider and built the next Facebook out of it; they don't care about our website at all, but they're using our identity APIs very frequently, so we can scale that service independently of the others and reclaim some of those wasted resources. I also mentioned better hardware utilization, independent of independent scaling: that's the ability to specialize the hardware for the task we're trying to scale out. Rendering a web page may be more CPU intensive than serving a JSON API, so we run those on bigger machines, and it turns out we want to do a lot of in-memory caching for our users service, so we run those on a memory-optimized instance type. This is a completely fabricated example, but it shows how you can pick the hardware that will be best utilized: if we picked memory-optimized hosts for everything, our API servers would probably waste a lot of memory.

But once we make this decision to decompose into microservices, we're left with a new set of challenges. When we decide to embark on a service-oriented architecture, having really strong interfaces is the hallmark of that architecture.
One thing that people often overlook is that when these services are calling into each other, you actually need a much better network than you did before you had a service-oriented architecture. We talked about how machine specialization lets us use fewer resources, but machine specialization is also another thing we have to pick, another part of the cognitive overhead of what we have to keep track of and decide. Additionally, the ops team has a bunch of new work. The ops team now has to figure out how to get logs out of all of these services; they have to figure out how to health check all of these things independently; they have to figure out what happens if one of the services goes down but not the entire thing; and they have to figure out how to deploy these things, and there are as many ways to deploy software as there are software stacks, so that can be a real challenge. Finally, all of these things need to be able to talk to one another, so there's a lot of networking overhead in making sure that everything that needs to communicate is actually routable.

Let's talk about some of these things individually. The first thing we have is strongly defined network interfaces. In this example the gold bars represent a network interface between services. We take our users service, the one responsible for talking to the data store, and we formally define how that interface looks. The choice of interface technology is relatively unimportant: you could use a plain old REST server, you could use a Thrift service, you could use gRPC, the new hotness from Google based on proto3, if anybody has heard of it or used it. The idea is to formally specify these interfaces in a way that other people can consume them, independent of what they're actually implemented with.
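To make that concrete, here is a minimal sketch, in Go, of what a formally specified users-service contract might look like. The type and method names are illustrative assumptions, not anything from the talk or from a real codebase; the same contract could just as easily be written as a REST spec, a Thrift IDL, or a gRPC/proto3 service definition.

```go
// A hypothetical contract for the users service from the example. Callers
// program against this interface only, so the backing store (a relational
// database, etcd, anything else) can change without affecting them.
package users

import "context"

// User is the record the users service stores and retrieves.
type User struct {
	ID    string
	Name  string
	Email string
}

// Service is the formally specified interface the web and API services consume.
type Service interface {
	GetUser(ctx context.Context, id string) (*User, error)
	CreateUser(ctx context.Context, u *User) (*User, error)
	DeleteUser(ctx context.Context, id string) error
}
```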
Each call between services also represents network traffic that we didn't have before. Before, a request came into our monolithic stack and basically stayed in the stack until we decided how we wanted to go to the database and fetch the data; now we're generating additional network traffic between these services. There's also the question of whether, when you deploy these things, for example in an auto scaling group, your services are one-to-one with the machines they're running on. Often the answer is yes, because that's the trivial way to deploy with an auto scaling group. As I mentioned earlier, we picked a memory-intensive host for our users service, a CPU-heavy host for our web service, and a fairly weak host for our API. This idea that we have to manage all these machine types and tailor them to the different use cases is another piece of overhead.

And we're left with a bunch of unanswered questions. Who is going to run these things? When are they going to run them? What are they even running, and how do they get it started? Why are they running them, and where are they going to run them? Often we'll just say "spin up a new cluster and deploy it everywhere," but that's probably not the best answer. At least we know who: the ops team, as always, is obviously responsible for all of this. And the answer to when is, as always, now; management never says, "ah, you've got that awesome new software, just let it chill for a little while." So two of the answers are trivial, but the others require a little more in-depth analysis.

I talked about the ops overhead of deploying things. If we go back to our example stack, we're using Django, we're using Go, and in a previous example we were using Rails for our users service. So I took a brand new Ubuntu install and asked how much software it actually takes to make that image capable of running a Rails app, a Django app, and a Go-based app, all behind nginx. I ran the installs and then diffed the list of packages on the server: it installed something like 147 new, uniquely versioned packages. These are all things the ops team has to understand and account for. Each of these packages may introduce security vulnerabilities; each has to be independently versioned and has to remain compatible across all the different versions of the software.

I also mentioned that the ops team has to know how to get your code onto the box. There are a variety of ways of getting software onto a box, and if each team picks a different one, it's really awful for the ops team. And there are as many ways to run a server as there are ways to get software onto one, so the ops team also has to have specialized understanding of how to run the things once they're on the box. All of these become informal contracts between the operations team and the people developing the software, and the question of adding a new piece of infrastructure, a new server type, or a new language becomes physically manifested as burden for the ops team.

But it's 2016 now, and we've had easy-to-use, lightweight containers since at least 2013. A lot of people have a lot of ideas about what containers actually are. At their core they're literally just an isolation mechanism: a way of saying these things should run in a way that's isolated from other things running in a similar fashion. We often use them for normalization as well: usually the containers share only the Linux kernel, and everything else is brought in with them. And finally, they're a building block for this next-generation infrastructure, this Google's-infrastructure-for-everyone-else. Google has been running on containers for a decade-plus at this point, and the rest of us are just finally catching up.

Containers also usually bring a few extra assumptions. Usually the container runtime you pick will also prescribe an image format. That image format is usually immutable: it's something you build ahead of time and deploy as a single unit. Containers usually have you bundle your dependencies, which means that all of those things the ops team had to install to prepare a server for your software now come with the container image. They often also prescribe a distribution mechanism: that's the rkt fetch or the docker pull, depending on which runtime you're using.
Container runtimes also say how to run things: they provide runtime metadata declaring the binary entry point for the container, the arguments you have to pass to get it fully running, and the environment variables it expects. Containers are usually lightweight, though not all implementations are. And they're often co-scheduled, which is actually one of the most important things we're going to talk about going forward.

rkt is a container runtime from CoreOS; it has somewhat different guarantees and priorities than Docker, and I'd be happy to answer questions about it at the end of the talk if we have time. Quay, as I mentioned, is the team I run. It's a distribution mechanism, a repository for your container images: you can think of it as GitHub is to your code as Quay is to your compiled container images, and we support both Docker and rkt.

While we're talking about what containers are, we should also talk about what they're not. Containers are not usually fully isolated: there's often some level of cooperation assumed when we co-schedule containers onto a box, and we really can't say these things are fully isolated against a hostile client. An example of where this manifests itself: prior to the inclusion of user namespaces in all the container runtimes, your per-user process limits were shared across users with the same ID number in all the containers running on a given host. They're also not as secure as VMs, so if you've built your infrastructure on virtual machines, containers are not a drop-in replacement: you can't run a hostile workload, something actively trying to escape the container. There are demonstrated container escapes if you run things as root inside the container, and there are rumors of container escapes even if you don't run as root. These are things you want to be aware of when you're picking a solution or deploying your first containerized infrastructure. And containers are not a panacea: as much as container vendors would like you to believe that if you adopt containers all your problems go away, that's not entirely true.

Where they really help is saving the ops team from the incredible burden of having to prepare a server for your software, figure out how to run your software, and get the software onto the box. A container image can be thought of as your app code, your dependencies, and a manifest that tells the container runtime how to run the thing, all bundled together as a single deployable artifact. As I mentioned, the manifest might carry information such as: this is the entry point, these are the required environment variables, these are the ports I plan to expose, this is the data I require. When we combine all of those things, we get something that's easily deployable. Before, we had all of those strategies: rsync, ssh, scp, Puppet and Chef, bundle install, pip. Those now collapse into one deployment strategy: we literally tell the runtime, "go out and fetch that deployable artifact and get it on the machine." And when it comes time to run it, because we have that rich set of metadata, running it becomes easy too: here's your host, you've got the software on the host, go forth and make the container run. Hopefully the person who built the container image has put enough information in there that it's a single operation.
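As a rough illustration of the kind of runtime metadata being described, here is a toy Go model of a manifest bundled with an image. The field names and the example values are assumptions made up for this sketch; they do not correspond to the exact schema of the Docker or ACI image formats.

```go
// A toy model of "the manifest travels with the image": entry point, args,
// environment variables, ports, and data requirements live next to the code,
// so running the image collapses to a single operation.
package main

import "fmt"

// ImageManifest bundles "how to run me" with the image itself.
type ImageManifest struct {
	Image      string            // immutable image reference
	Entrypoint []string          // binary entry point inside the image
	Args       []string          // default arguments
	Env        map[string]string // required environment variables
	Ports      []int             // ports the service plans to expose
	Volumes    []string          // data the service requires
}

func main() {
	m := ImageManifest{
		Image:      "quay.io/example/api:v1.2.3", // hypothetical repository and tag
		Entrypoint: []string{"/usr/bin/api-server"},
		Env:        map[string]string{"DB_HOST": "users.internal"},
		Ports:      []int{8080},
	}
	// With this metadata, a runtime only needs to fetch the image and start
	// the entry point with the declared args, env, and ports wired up.
	fmt.Printf("run %v from %s exposing %v\n", m.Entrypoint, m.Image, m.Ports)
}
```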
Now, with containers as this base layer, we've answered a few more of our questions. We take our microservices and build them into container images, so now we know what to run. We also know how to run it, because the container is self-describing: it says, "this is the way I want to run; if you follow the way I've decided I need to run, then I'll work. That's my guarantee to you, that's the promise."

But we still have a few more questions to answer, like why do we run a container? The trivial answer is because someone typed docker run or rkt run, but that's not really a why, that's more of a how. Some of the reasons we might decide we need to run additional instances of a container: we may have run out of capacity; maybe the host the container was running on died and went away, or there's a network partition; maybe we're launching something completely new, a new service, a new component the developers have decided they need to bring their app to reality; maybe we have high availability requirements, where one instance is enough to serve all of our traffic but we don't want the service to go down if we lose that one instance; and finally, we may not have a reason to exist in isolation, but, like our users service from before, other things depend on us.

And then of course there's the where, and the where is, I think, the most interesting question we can answer today. There are a variety of goals we're trying to satisfy when we pick where to run something. We may want to better utilize the hardware we've been allocated. We may want to better utilize the different types of boxes: maybe we've been given a whole bunch of memory-specialized boxes but we don't really use a whole lot of memory, so maybe we can load more things onto them. We also want to isolate failure domains; this is one of the things that I think a lot of people kind of know but haven't really internalized yet. And we need to be close to things: even if we've isolated all of our failure domains and we're utilizing the hardware to the best of our ability, there are always going to be speed-of-light problems. If you're too far from your data, it leads to a terrible user experience; if you're too far from your dependencies, it's not going to work out great; and even if all of this works beautifully at the border of your data center, if the users themselves are off in Australia and your data centers are here in the US, that's not going to be a great user experience either.

Let's talk about hardware utilization. Here's a virtual box: it has four units of memory, whatever that means, call them gigabytes or RAM sticks, and it has four units of CPU, which we can think of as quarters of the CPU. We've got a service like Redis: Redis is an in-memory data store, but it's very efficiently written, so let's say Redis takes up a lot of memory but not a whole lot of CPU. And we've also got a service running a VPN gateway: a VPN is doing a lot of crypto, and crypto is very expensive in CPU terms, but it's not keeping track of a lot of data; everything is streaming, everything is very efficient from a memory perspective. Who thinks they know where I'm going with this? OK, everybody's asleep after lunch, I guess. Ta-da: we've fully utilized our box in the most efficient way possible. This kind of bin-packing choice, made when we're scheduling things, can really get us to a higher utilization threshold than what we're used to. Traditionally, when we plan for capacity, we plan to use maybe 40 to 50 percent of a box; with this kind of intelligent scheduling, and with isolated failure domains, which I'll talk about next, we can load these boxes up much more heavily. I think we run our boxes without autoscaling until they get to about 80 percent CPU, so we can run everything a lot hotter, and of course that saves money, and everybody likes to save money; at the very least, their bosses like it when they save money.
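Here is a minimal Go sketch of the bin-packing decision just described. The exact resource numbers are assumptions: the talk only says the box has four units each of memory and CPU, Redis is memory-heavy, and the VPN gateway is CPU-heavy, so this sketch assumes requests of 3 memory / 1 CPU and 1 memory / 3 CPU respectively, and uses first-fit placement purely for illustration; real schedulers weigh many more constraints.

```go
// A toy first-fit bin-packer over (cpu, mem) requests, showing how two
// complementary workloads can fully utilize one box.
package main

import "fmt"

type resources struct{ cpu, mem int }

type box struct {
	free resources
	pods []string
}

// place puts the workload on the first box with room for it.
func place(boxes []box, name string, need resources) bool {
	for i := range boxes {
		if boxes[i].free.cpu >= need.cpu && boxes[i].free.mem >= need.mem {
			boxes[i].free.cpu -= need.cpu
			boxes[i].free.mem -= need.mem
			boxes[i].pods = append(boxes[i].pods, name)
			return true
		}
	}
	return false
}

func main() {
	boxes := []box{{free: resources{cpu: 4, mem: 4}}}
	place(boxes, "redis", resources{cpu: 1, mem: 3})       // memory-heavy
	place(boxes, "vpn-gateway", resources{cpu: 3, mem: 1}) // cpu-heavy
	// Both fit on one box with nothing left over: fully utilized, as in the talk.
	fmt.Printf("box 0 runs %v, free cpu=%d mem=%d\n",
		boxes[0].pods, boxes[0].free.cpu, boxes[0].free.mem)
}
```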
We also want to isolate our failure domains. By the way, whose architecture looks like this today, where you have internal load balancers and everything is independently scaled? It's actually a pretty solid architecture, so don't be embarrassed; I'm not going to say it's terrible. OK, surprisingly few. Whose architecture still looks like the monolith behind a load balancer? Again, it works for a lot of use cases and it's remarkably resilient.

The thing I want to talk about is the failure domain, the impact of losing one box. In this picture, each of the logical boxes that represents a service is literally one physical box, or one EC2 virtual machine, for example. If I lose just a single web box, I've lost 50 percent of my capacity to serve web traffic. Maybe that's fine; maybe we've over-provisioned by 50 percent, and now the other box is just struggling to keep up, running at 100 percent. But it's really not great. What we can do, through intelligent scheduling decisions, is break these things down into smaller chunks and amortize the failure. We took our two web services and made them smaller, and we spread them out over four boxes; maybe each one is only capable of serving half as much traffic as the old one, but in aggregate they have the same capacity. Our API service was already lean and small, so we just threw one of those everywhere we felt like it. And our users service is still heavyweight, but we found that by running our servers hotter, closer to capacity, we're able to fit more users-service instances on fewer hosts.

What we've essentially done is reduce the loss from any one box to a smaller fraction of each service. Now when we lose one box, instead of losing 50 percent of our web capacity, we only lose 25 percent of our web capacity and 20 percent of our API capacity. This is really easy to design around: if you run on a hundred servers, the single-box loss is approximately 1 percent. That's a really great benefit of letting these things be scheduled for you.

Of course, a single box is not the only failure domain we have to worry about. We can lose a whole rack: the top-of-rack router can die, or a couple of racks are sharing a power supply and that power supply overheats. We can lose a whole data center, which is the cataclysmic-level event, or more often a network partition. And for political reasons we've sometimes even lost an entire country: a country closes its internet borders, and any of your traffic that was being handled in that country is gone.
This is obviously a non-comprehensive list of failure domains, but these are some of the things you can think about when you ask what a perfectly scheduled set of services looks like for your use cases. When we start thinking about scheduling at the rack layer, the cluster layer, the data center layer, though, it turns into a problem that's pretty intractable for humans. In a single rack, sure, I could probably pick where these three services run and how many of each. At the cluster layer it gets harder: now I'm a human making hundreds of decisions, and if one of these things goes down I'm manually picking where to reroute it. That's not great. And I can have several clusters within a data center, which makes the problem exponentially harder; every time you abstract to a higher level, the problem gets harder for humans. When I was building this slide I noticed it starts to look familiar: if any of you have ever seen a die map of what a CPU looks like, this is the same kind of very-large-scale integration problem, something humans really can't do without help.

So to make this problem more tractable, we build higher-level abstractions, we build tooling to help us, and finally we build automation, because computers are way better at tracking millions of details than any human ever could be.

The first higher-level abstraction I want to talk about is the pod, in Kubernetes parlance, or the sidecar, as you may have heard it called in other container solutions. What these let you do is decompose your service even further into helper containers. You have your app code, which you're very familiar with, but you may also have an ops team that ships, say, a log-shipper container: something that sits next to your app and takes on the burden of making sure your logs get somewhere useful in case the app fails. You might also have proxies, which may be doing auth or load balancing, and middleware, which may make intelligent decisions about caching, for example. One of the interesting things is that these containers usually share a network namespace, which means that if they're being co-scheduled onto a box, they can rely on finding each other at localhost on some port. That's a critical part of making these things easy to compose and easy to deploy.

A simple example: I talked about the log-shipper sidecar, and I talked about how aggregating logs becomes an ops-team problem when we decompose into microservices. So I've taken my original three-service architecture and decided that everywhere I run one of my services, I'm also going to run a sidecar; I'm going to run these things in a pod, and that pod is going to have a container that can send my logs somewhere off-box, somewhere reliable, because we want to decouple ourselves from the single-box failure domain. In this case I decided to run the ELK stack, which a lot of people are probably familiar with; it's basically Loggly decoupled into open-source services, or Google's logging infrastructure decomposed into open-source services. That stack has its own database, it keeps data in its own off-box area, and it's just another network service. I've taken the problem of aggregating logs and solved it once for all of my containers. The interface between the individual containers and the sidecar is just the network, which is something I'm already familiar with, and the sidecars themselves talk to another well-defined service. So it really becomes a composable set of components, even below the service layer.
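As a rough illustration of the "find each other at localhost on some port" property of a shared network namespace, here is a small Go sketch in which the app container hands its log lines to a log-shipper sidecar listening on localhost. The port number and the line-oriented protocol are assumptions invented for this example; a real shipper such as Logstash or Fluentd defines its own inputs.

```go
// A toy "app container" that forwards log lines to a co-scheduled sidecar over
// localhost. Because the app and the shipper share the pod's network
// namespace, the address never changes no matter which host the pod lands on.
package main

import (
	"fmt"
	"log"
	"net"
	"time"
)

func main() {
	// Hypothetical sidecar port; always reachable at localhost inside the pod.
	conn, err := net.Dial("tcp", "localhost:5140")
	if err != nil {
		log.Fatalf("could not reach log-shipper sidecar: %v", err)
	}
	defer conn.Close()

	for i := 0; i < 3; i++ {
		// One line per event; the sidecar is responsible for getting it
		// off the box to wherever logs are aggregated.
		fmt.Fprintf(conn, "users-service event=%d ts=%s\n", i, time.Now().Format(time.RFC3339))
		time.Sleep(time.Second)
	}
}
```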
Another thing I alluded to in what actually makes this GIFEE-style infrastructure is the ability to deploy things declaratively. When I say declaratively, instead of saying "hey, server, go run this software," I'm saying "hey, infrastructure, I want this software to be running," and if the software isn't running, the infrastructure has to go do whatever is necessary to make it run. In this way I'm actually describing my app. I threw an example Kubernetes manifest on the right, a pod or replication manifest; it's basically just saying "I want two nginx instances running somewhere, they come from this container image, and when I run them I want them to be called nginx." It's really a way of abstracting ourselves away from the nitty-gritty of how things get accomplished and moving up to the layer of saying what we want accomplished. And this is how we answer the question of why: the why is "because I've described the infrastructure to look like this, and now you, the infrastructure-management software, need to decide how to satisfy that description."

We've been building up to this for a while, but you need a cluster-scheduling mechanism that sits on top of all your hardware and makes these decisions for you. Kubernetes is one particular flavor of cluster-management software; there's also Mesos, and Google internally uses a piece of software called Borg. These are basically one API you can talk to when you want to push descriptions onto your cluster. The scheduler itself is responsible for deciding what services are actually running and where they're running, and it does this by satisfying constraints you've defined ahead of time. You might say "I don't want these things to run next to each other," or "I do want these things to run next to each other," or "these things should be in different racks." You really want these primitives exposed to you so you can express constraints on how things are scheduled. Kubernetes is an open-source project originally started by Google; it has since been donated to a foundation, and we at CoreOS are upstream contributors to it.

Another important thing I failed to mention is that the cluster scheduler reacts to changes. If something fails, if our hardware architecture changes, if we suddenly have more hosts or fewer hosts, the scheduler is still responsible for making our description of what the world should look like into reality, regardless of what those failures or changes are. That also includes me, as a developer, saying that instead of needing two of these things, I now need three. The way it accomplishes this is by running a tight loop that asks, "what am I currently running, and what do I need to be running?" and then satisfying the constraints to fix that delta.
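Here is a minimal Go sketch of that declarative model and reconciliation loop. The spec and the loop are deliberately toy-sized assumptions: they stand in for what Kubernetes expresses with a replication manifest and its controllers, and are not the real Kubernetes API. Only the replica count is reconciled; a real scheduler also satisfies placement constraints.

```go
// A toy reconciliation loop: the user declares desired state ("two nginx
// replicas from this image"), and the loop repeatedly compares it with
// observed state and fixes the delta.
package main

import (
	"fmt"
	"time"
)

// DesiredState is the declarative description pushed to the cluster.
type DesiredState struct {
	Name     string
	Image    string
	Replicas int
}

// cluster tracks how many replicas are actually running (observed state).
type cluster struct {
	running map[string]int
}

func (c *cluster) reconcile(spec DesiredState) {
	observed := c.running[spec.Name]
	switch {
	case observed < spec.Replicas:
		// Start the missing replicas somewhere that satisfies the constraints.
		fmt.Printf("starting %d replica(s) of %s (%s)\n", spec.Replicas-observed, spec.Name, spec.Image)
		c.running[spec.Name] = spec.Replicas
	case observed > spec.Replicas:
		fmt.Printf("stopping %d replica(s) of %s\n", observed-spec.Replicas, spec.Name)
		c.running[spec.Name] = spec.Replicas
	default:
		// Nothing to do: reality already matches the description.
	}
}

func main() {
	spec := DesiredState{Name: "nginx", Image: "nginx:1.9", Replicas: 2}
	c := &cluster{running: map[string]int{}}

	// The loop runs forever in a real system; three iterations are enough here.
	for i := 0; i < 3; i++ {
		c.reconcile(spec)
		if i == 1 {
			c.running["nginx"] = 1 // simulate losing a host and one replica with it
		}
		time.Sleep(100 * time.Millisecond)
	}
}
```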
And so we've finally answered the question of where these things are going to run: we just let the cluster scheduler pick for us. So now everything is running; we've adequately set up our constraints to make sure things land in the right spots, and we've described what we want the world to look like. So we're done, right? It seems like it: all of our services are running. But one thing we haven't talked about is that once the scheduler decides where these things run, how do we find our dependencies? How do our external, user-facing load balancers find the services we're running?

This is another abstraction we build on top of all of these containers running on various hosts, called service discovery. Now, whenever the scheduler makes a placement decision, it also turns around and informs somebody else of the decision it made. In the Kubernetes case, the default is to have that information stored in etcd and surfaced through all of the traditional DNS tooling you're familiar with, using a service called SkyDNS. All of this is pluggable and you can replace it with whatever you want. SkyDNS itself is also open source, and I just find it entertaining that they literally call themselves Skynet, which is a pretty charged term when it comes to letting the machines take power away from us. When we put it all together, our existing constructs for resolving things, like gethostbyname, continue to work: we can say "gethostbyname, find me an API server," and it will actually return an IP address.

Now that we've got that IP address, though, we may still have a challenge reaching it from wherever we happen to be scheduled in the cluster. So another higher-level abstraction we've built is called an overlay network. I don't want to dwell too much on what an overlay network actually is, but the goals, the problems it solves, are about letting containers find each other easily. Every time we make a scheduling decision to create one of those pods we talked about earlier, the ones with their own network namespace, we also assign that network namespace a unique IP per pod. That allows us to continue using the standard networking approaches we've been using for decades: our HTTP services can still bind on port 80; they don't have to run on some high-numbered port with a proxy in front to get traffic to the right place. We're also aware of our actual IP address as other containers see it: if I bind to a particular IP address, I can turn around and tell people I'm bound to that address, and that's useful to them. That's often not the case when there are several layers of routers or any kind of network address translation in place. The overlay network gives us a network that's scheduling-aware and container-specific, on top of whatever physical network we're built on. It also eliminates one of the fundamental constraints of standard Linux boxes, which is that we're often limited to one IP address per host. That really isn't going to work if we want co-scheduled things that can each bind to whatever port they've been pre-configured to use; we really need multiple IP addresses per host, and an overlay network helps us solve that problem.
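As a small illustration of discovery continuing to work through ordinary name resolution, here is a Go sketch that looks up a service by DNS name and connects to it. The name api.default.svc.cluster.local is an assumption about how a cluster DNS add-on such as SkyDNS might expose a service called api; the exact naming scheme depends on how your cluster is configured.

```go
// Resolve a service through cluster DNS (the gethostbyname path mentioned in
// the talk), then dial whichever pod IP comes back.
package main

import (
	"fmt"
	"log"
	"net"
	"time"
)

func main() {
	const service = "api.default.svc.cluster.local" // hypothetical service name

	addrs, err := net.LookupHost(service)
	if err != nil {
		log.Fatalf("service discovery failed: %v", err)
	}
	fmt.Printf("%s resolves to %v\n", service, addrs)

	// Thanks to per-pod IPs on the overlay network, the service can listen on
	// its natural port (80 here) rather than a remapped high-numbered one.
	conn, err := net.DialTimeout("tcp", net.JoinHostPort(addrs[0], "80"), 2*time.Second)
	if err != nil {
		log.Fatalf("could not reach %s: %v", service, err)
	}
	defer conn.Close()
	fmt.Println("connected to", conn.RemoteAddr())
}
```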
All right, let's see how we're doing. We now have a better set of answers, and some of them have changed a little. The who is no longer just the ops team: we're letting the cluster-management software make those decisions for us. Of course the ops team still has to run the cluster-management software, but we've abstracted the problem one layer higher to make it more human-friendly. The when is still the same: we pretty much always want to be running the things we need to serve our customers. The what has changed a little: instead of running an individual container, we're now describing a set of containers that need to run together, in this pod or sidecar abstraction. The how is still the same: these higher-level cluster-scheduling abstractions still just run containers at the end of the day, co-scheduled onto boxes, so that hasn't changed. The why is still the same as when we originally answered it: the user wants it to be so, and so we make it so. And the where is wherever the scheduler decides is best, wherever it can satisfy your constraints about failure-domain isolation, number of running instances, capacity, what type of box to run on, and so on.

Now let's look at our GIFEE scorecard: what did we get right, and what did we not quite get? We've isolated our failure domains, which means everything down to our services is now cattle, no longer pets. We're doing declarative deployment, which is much preferred over imperative deployment because it puts all failures into one common resolution mechanism: it's not running and it needs to be, so make it so. We've got automated scheduling: Kubernetes is making decisions for us based on what we've described should be running and what is actually running. We've brought in a service discovery mechanism. I touched on shared services on the cluster when I showed the logging stack, but if you extrapolate from there, you could run monitoring software in a centralized location, you could run metrics software that turns your monitoring into useful dashboards, you could run data-warehousing software that you push values into once a day and query later, and all of those can be found at a known address thanks to service discovery. And we have this software-defined network; it needs to be a really powerful network, because we've dramatically increased the amount of network traffic in order to get this isolation. If you think about it, we've essentially made a trade-off between network traffic and failure domains: we've traded more network traffic for better failure response, which is kind of an interesting fallout.

But one thing we didn't really talk about at all is storage. When we're running these things on a cluster, we can't really store data inside the container, because containers are ephemeral, and we can't really store anything on a particular box, because if that box goes away we're right back to the same single-box failure scenario we had before. I think storage is where you're going to see a lot of innovation over the next year or two, along with improvements to the schedulers: a lot of the higher-level abstractions, like scheduling around a particular rack, are still clunky to express with the cluster-scheduler software we have today, and storage is still an open question.
Your current storage solutions will continue to work: if you're using NFS, you can still bind your NFS volumes in the same way, but that isn't the cluster-aware, resilient storage we're hoping for. You may be familiar with a project called Ceph; Ceph is a software-defined storage solution as well, but again it's not great at running on the cluster itself, because it relies on being able to persist data to local disk. We are seeing new cluster-aware storage come out: ClusterHQ has a solution that uses ZFS snapshots to migrate your data around, and an interesting one for us, because it's built here in New York City and they're friends of ours, is CockroachDB. They're attempting to bring a Spanner-style consistent database to the cluster, doing all the storage in a cluster-aware way while keeping it resilient, hence the name cockroach.

Finally, I'd be remiss if we didn't talk about what we sell at CoreOS. We sell a vision, a specific slice of all of these technologies combined into a stack we call Tectonic. Everything on this slide except the Quay product is open source; even for the things we sell, we are truly committed to open source and to making sure these technologies are available even if you don't want to buy our supported version, so you can build this goodness up from the ground yourself if you need to. I haven't mentioned Clair yet: Clair is a vulnerability scanner we built to identify vulnerabilities in all of those packages I listed earlier. When we start bundling dependencies, we've moved the problem from an ops problem to a container-image-maker problem, and Clair really helps with identifying and addressing those vulnerabilities. If you want to hear more about this style of infrastructure, we're holding CoreOS Fest in Berlin this year; it's just under a month away, and I believe there are still a few tickets available if anybody wants a last-minute trip to Berlin. Highly recommended. And that's all I had, thanks. I hope there was at least one non-trivial realization in there; I think the failure-domain isolation is something we're all kind of aware of but nobody's really thinking about in concrete terms, so that was one of the main takeaways I hoped you'd leave with.

Unfortunately we have only about three minutes for questions, so I can probably take one or two. Anybody? Bueller? You want a t-shirt? Is that a question? If I understood the question correctly, it was: can't we already get a lot of these same abstractions by running virtual machines? Oh, OK, now that I understand the question better: this abstracts over a single data center; are we a couple of years away from "the JVM for the data center"? That's already something Mesos is attempting with DC/OS: they're trying to advertise a set of primitives that you can program against for the whole data center, so there's a project already attempting to accomplish that. What you lose is the decades of knowledge we've built up about how to run servers, how to do networking, how to bind ports; you really have to adopt the new paradigm pretty much 100 percent. But yes, we are abstracting over the data center, and if you go into the Kubernetes GitHub repository, there's actually a project in there called Ubernetes.
Ubernetes abstracts data centers across the planet. So now I don't just want to ask how I can schedule around the loss of one machine or one cluster in a data center; I also need to start making decisions about how to run these things in a geographically distributed manner, in the same declarative fashion. I might say something like, "this needs to run in three different places on Earth, but by the way, bandwidth between data centers is very expensive, so I also want to avoid shipping data whenever possible." These are the things we're looking toward. I don't know if that answered your question, but I certainly tried. Anything else? I think we're officially out of time. OK, all right, thanks guys.
Info
Channel: O'Reilly
Views: 35,498
Keywords: O'Reilly Media (Publisher), O'Reilly, OReilly, OReilly Media, containers, microservices, microservice, microservice architecture, jake moshenko, deploy, applications, Docker, deployment
Id: 7ZFBn_e27o0
Length: 49min 31sec (2971 seconds)
Published: Thu Jul 28 2016