Mesos and the Architecture of the New Datacenter

Captions
Okay, why don't we get started. This is "Mesos and the Architecture of the New Datacenter." I'm going to preface this by saying it is definitely planned for beginners, so anyone looking for me to deep dive into Mesos internals is going to be disappointed; I apologize in advance. But why don't we jump in and get started.

So who am I? I'm Thomas. I'm a product manager at Mesosphere, and I'm a bit of a bear. There are some bets going on: I'm a little bit known for not wearing any shoes, so for everybody who wanted to know, yes, no shoes on the platform. That's the way I roll. I've been doing distributed systems for a long time; you'll all notice my beautiful neckbeard. I've not actually been a product manager most of my life. I'm an engineer; they just told me I had to do product because my code wasn't good enough. And no, it wasn't said in a nice way, it was very much "no, we just won't take your code at all, deal with it, Thomas." Anyway, I work for Mesosphere: the Datacenter Operating System, Mesos, all kinds of cool things.

For this talk I'm not actually going into the architecture of a new data center. As someone who's done ops, I can rant eternally about all of the things related to architecture, but I think there are going to be at least five or ten other talks here that do it. So instead, what I'd like to do (ah, my slides are mixed up) is talk about why we need a new architecture from the developer's perspective, what it does for the developer, and how far we can take it.

So, going back a slide here: I've got a bitly link. I created a DCOS cluster; anybody who wants to go play around with it, please do. Put that in; that's the cluster I'll be using, so try not to kill it for me or else I'll be very sad. But otherwise, kick the tires while we're doing the talk. I'll probably leave it up for a while afterwards, but you're welcome to give it a try while we go along. So we talked a little bit about
the agenda already. Who uses a data center? I think this is the most important thing for us to think about, because there are a ton of people who use data centers, but there are two that I think are really important, particularly for this talk. Being a product manager, I care a lot about personas, so at Mesosphere we have a couple of personas that are near and dear to my heart.

The first one is Dan, the data center operator. Dan wants to have his data center happy. He wants to make sure he's got enough capacity to run everything that's going on, and Dan really hates getting woken up at 3:00 a.m. Who's been on a production service, been woken up at 2:00 a.m., and just grumbled and hated it? Right. I can't tell you the number of times that I, as an operations person, got woken up because of some... I'm not going to say developer, but since I wrote the code I'm going to say developer. And it then took two hours to debug, and then you have to wake the developer up. It's really painful.

So what we like to do, and this is why I think this new architecture is really important, is that Dan isn't in charge of the applications anymore. He doesn't have to take care of or worry about the health of the application. He worries about the top-level services: he worries about Marathon, is it running? He worries about Jenkins, is it running? He doesn't actually worry about the apps that run on those. This is a really important point, because it's changing the responsibilities of your traditional data center. The number of startups I've worked at where you pitch code over the wall to a Dan, who puts it together, eventually makes it into some scripts and runs it, and as a developer you have no idea what's going on... I mean, it's something that always happens. And the last point there is that he doesn't actually want to know what the individual workloads are. Dan really doesn't care about your Java app; he just wants to make sure that it's up and running and the CEO isn't bugging him about
having major downtime.

So then we move on to Alice, and Alice is our app developer. Alice wants to create new apps in a rapid cycle. In fact, Alice wants to be self-serve. There's no reason why Alice would want to wait months or weeks for a deploy to happen, because at some point, if you throw your code over the wall for someone to do a deploy, you sit there and wait, and when it goes into production you've forgotten what code you wrote, and now you have to debug this app that you wrote a month ago. It's painful; I really hate it. So that is basically Alice in a nutshell.

So what is broken in our data centers? Let's start off by talking about how you deploy to one server. Again, I'd love to see a show of hands: how many people have run production apps in screen on their cluster? Yes, it's not just me. You don't need to use an init, no, no; you just put it in the screen and you walk away. In fact, I'm going to say that if you're running it on one server, that's the right way to do it. It's easy: you just go start the app, you don't need to think about it anymore. You scp the bits over, and if you're missing a dependency, it's one box; all you need to do is install it by hand. So this is our mainframe world, this is the way it was back then, and it was great.

But at some point our apps needed to fit on two servers, and when we hit two servers, screen started to not work real well. ssh server 1, ssh server 2, ssh... oh geez, I don't want to type ssh anymore. And then you've got cluster SSH, and, I mean, the number of tools out there just to ssh into N servers is intimidating.

But apps don't fit on two servers anymore either. How do you get to ten servers? Well, when we needed to go to ten servers, we created Chef, Puppet, Salt, Ansible, CFEngine; pick your code deployment tool of choice. And to be honest, it's pretty good when you've got ten servers. It works. It takes you a while, you write some code to deploy your code, 'cause dude, I heard you like code to deploy code
to run your code so that you could code all the time. But it is a little bit fragile. In fact, one of the first products that we had at Mesosphere was basically an implementation of Puppet to deploy Mesos clusters into the cloud, and we saw on average 25 percent failure rates when creating clusters. The reason for that was: you have a random Ubuntu mirror in Amazon that just doesn't want to work. What do you do? Another big one that we had was that DigitalOcean just happens to not have instances sitting around for a couple of minutes. What do you do? It really ends up being fragile, and when it's fragile and you do a deploy, it comes up on 75% of your boxes and you hope that it's right, and then the other 25% don't health check, and then you have to sit there and debug them.

And again, at 10 servers it's not that big of a deal. You go and you kick a couple of servers. Or again, someone, we're going to call him operations guy, goes into a server and fixes something, or does some performance tuning on one box, didn't tell anybody, and now you've got this cluster of snowflakes, of pets that you've given names. So we've got Fido on the left and Billy right there, and you get to be personal with them.

Once you get to 100 servers, this really starts to fall down. You end up with constant 2:00 a.m. debugging sessions. I was actually chatting with an application developer friend of mine about working at eBay, and he said that in the eBay deploy windows you had a 2 a.m.
session. The code would go out and you would just sit there and debug and debug and debug and be on a conference bridge, and it was one of the most painful things that he had experienced in his life. And if we think about it, deploying your code is the thing that matters, because deploying your code gets it to the customer. So if you are scared of deploying code, it slows the business down, it makes your developers unhappy, it makes your ops guys unhappy. Again, it's painful; it hurts.

So let's talk about deploying to 10,000 servers. If you've got 25% failure rates, you need an army of Dans, because 25% of 10,000 servers requires days of debugging to get your application up, and it's basically never-ending. You end up putting out fire after fire after fire, to the point where you can't actually do anything with your infrastructure; you're just trying to keep it running. And once you're behind the eight ball, it's almost impossible to get caught up again.

So obviously there is a light at the end of the tunnel. We happen to be here at MesosCon; isn't it great? So here we go: on the left is a Sun Fire E25K. I decided to do a little bit of googling. This mainframe came out in 2008. Does anybody want to guess how much you can buy one for now? eBay has a current auction for $8,000. But I think the most important thing to look at here is: how are these two different? You've got 144 cores and a terabyte of memory, and you have 144 cores and a terabyte of memory. I understand that this is high level and a bit abstracted, but when you think about it, it's resources and it's scheduling, and that's what Mesos does.

So this is an awesome quote from Ben: 64 cores or 128 cores in a single computer looks a lot like 64 or 128 hosts. Why are they different? Why have we created this block in our brains that says that this is something different? Well, Mesos says it doesn't have to be. As soon as you start thinking about it as one big computer, there's all of
these amazing things that start to happen, because you manage your resources globally now. So instead of having a statically partitioned Hadoop cluster and a Spark cluster and a Java cluster, you manage your resources globally. In fact, capacity planning is one of the hardest things that you do traditionally as an operations team, because you need to go and look at what everything is doing and how they're related, and make sure that you order servers just in time for any of these apps that might be running in your data center. But with this, you just put in an alert that says, "Okay, we've got 25% capacity, guys, it's time to go order a rack." You can actually just send an email to your Dell rep and say, "Okay, another rack please, I've got less capacity." Or you can even go more advanced: in the cloud, you can just go launch more servers. It makes auto scaling become this thing that is not really an issue anymore.

So we're talking about deploys, we're talking about app developers, and I'd like to bring everybody back to screen. I know that I said screen was a little ridiculous, but it's so easy: you just ssh into the box, you start up your screen session, and you have a deploy. And the most important thing, again, is getting code out to customers, because that's what makes people happy. So why can't we do the one-computer deploy as well? We can: let the tools take care of it. All you need to do is define the resources that you require, state where to get your code, and stand back. You'll notice that this is a sleep container. I'm going to say that the killer app for the data center is distributed sleep; you heard it here first. At least I know how to run distributed sleep really well.

So let me go and do a bit of a demo here. I'm going to show off DCOS, but pretty much everything that I'm showing off is Mesos-specific; I'm just really lazy and know how to use DCOS. So first, let's focus a little bit on that one big computer. We've got 24 CPUs and 82 gigs of memory. I don't
have anything failing, and I've got a service that's running. So pretty much within 30 seconds I know the health of my data center; I know what's going on. It's a big computer: I've got six nodes, I've got resources; why do I care about anything else? And then we can go over here and look at Marathon, and luckily no one has started some horrible app that would embarrass me.

But let's actually start an app here in my fancy terminal. If I go take a look at my container JSON, it's just a simple sleep container; again, basically what I showed off. Go Docker. Thomas's sleep container is very special: it is a 3-megabyte container that only contains the sleep binary. But deploys become just as easy as screen here: dcos marathon app, ah, start... whoops, nobody saw that. Well, of course. There we go. And just to show that the proof is in the pudding, sure enough, distributed sleep is happening in front of us in real time. Here we are, started up. And it's easy to scale, it's simple: we can just hit 20 here. Not only that, but it's easy to update. If I went back to my JSON here and put in a new tag for this Docker container, all I'd have to do is update it, and Marathon and my one big computer would take care of deploying that and running it on my entire data center.

So let's get back to the slides here. Sure enough, I've got a bunch of instances running. So I've shown deploys. It's easy, that's fantastic. Kubernetes does this, YARN does this; Mesos clearly does it better, that's why we're here. But let's not stop there; let's go to the next level. We've shown off how application developers can self-service for applications, but the applications of today are more than just sleep. Distributed sleep is a very unique application that only works for me; come to find out, some people actually want to have business value in what they run. And so applications are not single applications anymore. We started out with a LAMP stack, and we had Apache and MySQL, and, you know, life
was good. But now we have Cassandra, we have Kafka, we have Kubernetes, we have Spark. Our applications of today, to deliver the really big business value, are made up of hundreds of these things, and they're coming out every day. And I just had to throw in web 3.0; I think we're on web 5.0 now, I'm not sure.

But how do you deploy a new service? We talked about an application, but that's pretty simple; it's pretty easy to figure out how to get a sleep container onto a box. But how do you get a Cassandra ring running in a data center? Well, you go purchase new servers. And in the cloud you still have to go purchase new servers; they just happen to come in 30 seconds instead of two weeks. And then you have to rack and install them, you have to write the deployment scripts, you have to hook up the monitoring and the alerts, and at some point this is an amount of work for your poor Dan in your data center of weeks, maybe even months, because he needs to figure out this Cassandra ring, how it works, and how to make it run in production. And so the Bastard Operator From Hell comes out, and when you come and ask for a new service, Dan goes, "There's no way, no way I'm going to let you run that service. It's just not going to work out." And that sucks. As an application developer you want to go run the new hotness, you understand that it's going to make things faster, and it's awful to hit this hard wall that says, "No, I'm sorry, you can't do it."

Well, Mesos can help. Why? Because it's a two-level scheduler, and two-level schedulers are magic. Because we've abstracted the resources from the cluster, we can have something that orchestrates a very specific application. One of the great examples of this is the HDFS framework that's out there right now. HDFS needs to come up in a really specific way. You can't just go and create HDFS containers and tell something to go launch them across your cluster. You need to go bring up the name nodes, then you need to bring up the data nodes, then you need to
make sure that anything you're running is on the data nodes so that you have locality, so that it works. And that's where the two-level scheduler comes in: it makes it so that that complexity is taken care of. There's a talk later today by Joe Stein about the Kafka framework. The Kafka framework is Joe Stein saying, "This is how you run Kafka in production." And as an ex-operations person, that makes me really happy and comfortable. It means that I don't have to think about deploying this, I don't have to get new servers, I don't have to monitor it. I have somebody who's an expert who said, "This is the code, this is the automation." Why don't we run it?

So let's create a package manager. This happens to be a feature unique to DCOS, but the package manager is actually nothing more than a Marathon JSON that you can configure. Very, very complicated. So basically this just launches the scheduler and lets the scheduler take over the rest. And in fact there's a bunch of packages. If we go back to my command line here and I do a dcos package search, we get to see that there are actually quite a few fun packages here. In fact, there are two in here that are particularly interesting: Kubernetes and Swarm. In the past, when you wanted to go play around with a new tool, again, you had to wait as an application developer, wait and wait and wait. But now you can go to Dan, and Dan types dcos package install kubernetes. Yes, I'm really sure that I'd like to do that, thank you. And now I have Kubernetes; I can go use Kubernetes.

Has anybody here actually tried to set up Kubernetes on their own? How many hours did it take you to get it running? Days, right? Well, you're better than me; it took me a week. It was embarrassing. But we have it up and running here in seconds. And there's more; there are all kinds of things here, just like the packages you would expect on your operating system. They come up, they're running, and if we go take a look at DCOS you'll notice that we're consuming resources. We have memory
allocation consumed, we've got tasks running; it's coming up, it's doing what it's supposed to. In fact, why don't I just go show off the Marathon UI. Here we have Kubernetes; it's getting launched, and Docker containers are, you know, big, so it'll take a little while, but it'll come up. So that's package management. And because of that, Dan says yes. Dan doesn't need to take care of it anymore; he just makes sure that it's running. He gets out of your way.

So where do I start? Well, I've got that cluster up and running. I honestly would encourage everybody to go check it out if you're interested; it's pretty fun to kick the tires. If you'd like to create your own cluster, DCOS is a click away at mesosphere.com/amazon. That's the current cloud provider; we also have some Azure targets that are available. You can launch an application: we've got a tutorial for our Twitter clone, a.k.a. Oinker, so that you can all oink at each other. And it's not just a trivial web app. It actually brings up Cassandra, it brings up a Rails front end, it gives you a load balancer and routing, it gives you Spark. It's a real, legitimate application, and as someone who's done a little user testing on this, it'll take you about 30 minutes, which is just awesome; it gets me really excited. And then finally, we've got a tutorial on installing services like I did. The command line is just that easy; all you need to do is install services.

And with that, I'd strongly recommend everybody go put Mesos in production. Obviously DCOS is awesome, but I'd strongly encourage you to do at least Mesos. You're all going to be happy about it. And with that, questions? Anyone? Anything at all? Have I made some wonderful... oh, yes sir.

Yeah, yeah, let me actually show it; let me show you how it works. So the question was: how does the package manager work, how do you get it into DCOS, why is that out of the box? So if we go to... uh, I didn't even spell that right. I need to just autocomplete things. No, that's not even it. Okay,
that's it. I'm going to type this in really quick here. Ah, stuff and things. There we go. So this is the Universe. This is the source of all packages in DCOS, and it is as simple as... let's go look at one of the packages. Why don't we show off the Marathon package here. It's versioned, just like you would expect from a package manager, and then there's this Marathon JSON. And this Marathon JSON is the Marathon JSON that you would use to launch it. So in fact, if you didn't want to use DCOS at all, and I'm probably horrible for saying this, you could just get this Marathon JSON and run it on Marathon and you would be happy. The only thing we're doing here is making it configurable: you can set the CPUs, you can say whether you want to use HTTP or not, and you can tell it where everything is. There are a lot of wonderful configurable features in Marathon. But that's pretty much it. The Universe... the CLI just pulls this down as a JSON blob and runs it. We like to keep things simple and make them work. Any other questions? Go.

Yes, so the question was: what is Kubernetes, why is it on Mesos, how do they work together? So let's get back to that two-level scheduler again. Mesos only cares about resources. It says, "I've got resources here, why don't you use them?" That's all it does. Kubernetes does everything else. So putting them together, Mesos tells Kubernetes, "Here are all of the hosts you can launch things on," and Kubernetes goes and launches them. So it's actually a pretty simple integration, because Mesos just makes the resources available and then Kubernetes can consume them. Kubernetes is a fun one. The one that I really like is Myriad, which is running YARN on top of Mesos, which kind of breaks your brain. But it's really the same thing: because YARN is that monolithic scheduler instead of a two-level scheduler, you can put it on top, because Mesos makes it aware of the resources that are available, and then YARN decides to use them for your Hadoop, for your Hive, for whatever you want in that
Hadoop ecosystem.

So the question was what happens when resources become unavailable, and that is entirely up to the scheduler. With Marathon, if a task dies, it waits for the next resource offer and launches it again. For Project Myriad it's pretty much the same thing: Myriad goes and grabs some more resources and schedules the jobs that died over there. But the cool bit here is that you are not stuck in this world of Chef and Puppet. It's dynamic: because the framework understands the topology and the resources of the cluster, it can react to them. Whereas with your Chef and Puppet, Chef doesn't know when a node went down; it just knows when you run it again, and because of that it can't react to the failures. One of the cool things about the Cassandra framework is that when a data node goes down, it brings up a new one and it rebalances the data. Again, Dan doesn't want to get woken up at 3:00 a.m.; computers take care of that for you. It's that codification of the operations process that gets handled. Any other questions? Yes.

Can I contrast it? Yep. In fact, at some point Kubernetes is a competitor to Marathon. I will, again, not say the party line here: Kubernetes has got some really great opinions. Pods and labels, for example, are really fantastic. A lot of the things they're doing with the durable pods are really great. And so it becomes a pick and choose, because you can run as many frameworks as you want on Mesos. You don't need to have this cluster that just launches apps; you can have a cluster that launches apps and runs Cassandra and runs Spark and runs Kubernetes and does whatever cool new thing you want. In fact, one of the things that, as a product manager, I'm most excited about is this ability for partners who want to run things on prem to not need to sort out what it takes to run on prem. You can put a package together and dcos package install it and deliver it to your customers in a fast, easy way, for a distributed system, that
nobody can do today. Again, doing a trial of your software, for Cassandra for example, is a significant investment for the organization. It takes a week of somebody's time, or two people's time, to get it up. But if you can just type dcos package install cassandra, you can go kick the tires, you can go check it out, you can see whether it's something that you want. And so no longer is it reading a bunch of white papers to say, "Okay, I think that this is the right thing to do." You just go run it, try it, see if it works with your app. That's something I get, again, super excited about. Okay, any other...

How are the frameworks managed by Marathon? So Marathon is a meta-scheduler, and that breaks most everyone's brains. I've got a designer at Mesosphere I'm still trying to get this concept across to. But what you need to do is remember that you can disconnect a process that's running as a scheduler. So let's pick on Kubernetes, for example, because I showed that off. Kubernetes has a scheduler that runs, and it has tasks that it runs, and the scheduler needs to run somewhere. So if we take Marathon out of the picture, how would you run the Kubernetes scheduler? You would go write a systemd unit, you would go run it on one of your hosts, and the systemd unit would make sure that that box got restarted, it would make sure that everything worked, and then the tasks would get launched by Kubernetes. So by putting it into Marathon, all we've done is move it from a single-box systemd to a cluster-wide systemd.

I'm not sure I understand... right. So what happens is the scheduler launches those. Right, and so the thing that Marathon manages is only the scheduler. So the scheduler process comes up, it registers with Mesos, and then it goes and creates the master and all of the other nodes that it needs to actually run Kubernetes, and then those are managed and owned by the Kubernetes scheduler itself. So again, Marathon is doing nothing more than init for
your data center computer. I mean, again, it's the same way with DCOS: all we've done is claim that Marathon is an init for your data center and just launch your schedulers on it. And again, it's a bit meta, but that's the way it works.

Mm-hmm. Sure, sure: you go and shut the tasks down. So, entirely seriously, the question was: you're out of resources, what do you do? Or maybe an even better one is: how do you make sure that each framework gets the resources that it should? In Mesos, the scheduler itself inside of Mesos does something called Dominant Resource Fairness, and what that basically says is that each framework has its own max share. In fact, I might be able to show this off here. If we go to the Mesos UI itself and we go over here to Frameworks (watch this just not work anymore)... it does. So you'll see the max share here. Mesos is tracking the amount of resources that have been consumed on this cluster, and you'll see that my Kubernetes here is consuming six CPUs and 26 gigs of memory, and Marathon is only consuming three CPUs and two gigs of memory. And so because Marathon has a smaller max share, it will get the resource offers first; Marathon will get the first chance at them. Now, that isn't a guarantee, obviously, and that's where one of those new features that Ben talked about earlier in the keynote comes in. That's where quota comes in. With quota you can say, "No, no, really, Kubernetes, you get 100 gigs of memory. That's all you get. Take care of it."

Great. Yes, DRF is always going, and again, all DRF does is reorder how the resource offers are offered to a framework. That's all it does. And you can actually configure DRF with different weights, so you can say, "I know Kubernetes has 48 percent of the cluster..." Wow, that's really going. Anyway, "I know that it has 48 percent of the cluster, but it can have that, and so I still want to give it resource offers." But it's not sufficient, again, for guarantees.
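
The offer ordering described in this answer can be sketched in a few lines of Python. This is a simplified illustration, not Mesos's allocator code: the `Framework` class, the `weight` field, and the way a weight divides the share are assumptions made for the example. The cluster totals (24 CPUs, 82 GB) and the per-framework usage are the figures quoted in the demo.

```python
from dataclasses import dataclass

# Cluster totals quoted earlier in the demo: 24 CPUs and 82 GB of memory.
CLUSTER_TOTAL = {"cpus": 24.0, "mem_gb": 82.0}


@dataclass
class Framework:
    """Hypothetical record of what one framework currently holds."""
    name: str
    used: dict           # resources held, keyed like CLUSTER_TOTAL
    weight: float = 1.0  # assumed knob: a higher weight lowers the effective share


def dominant_share(fw, total=CLUSTER_TOTAL):
    """Largest fraction of any single resource the framework holds, scaled by weight."""
    shares = [fw.used.get(resource, 0.0) / total[resource] for resource in total]
    return max(shares) / fw.weight


def offer_order(frameworks, total=CLUSTER_TOTAL):
    """Frameworks sorted so the smallest dominant share is offered resources first."""
    return sorted(frameworks, key=lambda fw: dominant_share(fw, total))


frameworks = [
    Framework("kubernetes", {"cpus": 6.0, "mem_gb": 26.0}),
    Framework("marathon", {"cpus": 3.0, "mem_gb": 2.0}),
]

for fw in offer_order(frameworks):
    print(f"{fw.name}: dominant share {dominant_share(fw):.3f}")
```

With these numbers Marathon's dominant resource is CPU (3 of 24) and Kubernetes's is memory (26 of 82), so Marathon sorts first. The sketch only reorders offers; it guarantees nothing, which is the gap quota is meant to fill.
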
That's where quota really comes in. Okay, time for one more question.

Mm-hmm. So the decisions for whether to run a process or kill a process are delegated entirely to the framework itself. What that means is that Marathon will never kill a task, it will never release those resources, unless it's told to. Now Chronos, in comparison, is a distributed cron for your data center, and so it has a lot of logic in there to actually kill tasks that run too long. But again, Mesos itself only cares about resources. It says, "Hey, I've got some resources here, would you like to take advantage of them?" and then it waits for you to say whether you want to consume those resources or not. And because of that abstraction, we can do a lot of cool composition, which is why Mesos is something that you can put Kubernetes on top of. You can run the batch workloads, because your batch workload scheduler will run them and knows how you want to run them, and your long-lived application scheduler knows how to run those as well.

Okay, I think we're out of time. Thank you so much, everyone; I really appreciate it. Have a great rest of the day.
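
The two-level split described in the talk (Mesos offers resources, the framework decides what to do with them) can be illustrated with a toy offer loop. This is a sketch under stated assumptions, not the Mesos scheduler API: `Offer`, `SleepScheduler`, `resource_offer`, and `offer_round` are hypothetical names, and the agent sizes and task sizes are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    """Free resources on one agent, as presented to a framework."""
    agent: str
    cpus: float
    mem_mb: int


@dataclass
class SleepScheduler:
    """Hypothetical framework scheduler that wants `pending` fixed-size tasks."""
    pending: int
    cpus: float = 1.0
    mem_mb: int = 128
    launched: list = field(default_factory=list)

    def resource_offer(self, offer):
        """Level two: decide how many tasks to launch on this offer (0 declines it)."""
        fit = min(int(offer.cpus // self.cpus), offer.mem_mb // self.mem_mb)
        count = min(self.pending, fit)
        self.pending -= count
        self.launched += [offer.agent] * count
        return count


def offer_round(free, scheduler):
    """Level one: offer each agent's free resources and subtract whatever is accepted."""
    for agent, (cpus, mem_mb) in free.items():
        count = scheduler.resource_offer(Offer(agent, cpus, mem_mb))
        free[agent] = (cpus - count * scheduler.cpus, mem_mb - count * scheduler.mem_mb)


# Made-up cluster: two agents with different amounts of free memory.
free = {"agent-1": (4.0, 256), "agent-2": (4.0, 1024)}
scheduler = SleepScheduler(pending=5)
offer_round(free, scheduler)
print(scheduler.launched)
print(free)
```

The offering side never inspects what the tasks are; it only tracks resources. Placement policy, restarts, and kill decisions would all live in the scheduler class, which is why a batch scheduler and a long-running-service scheduler can share one cluster.
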
Info
Channel: MesosCon
Views: 6,932
Rating: 4.8545456 out of 5
Keywords: MesosCon Seattle 2015, Mesos, MesosCon
Id: M_KjGMImOmA
Length: 36min 16sec (2176 seconds)
Published: Wed Aug 26 2015