Journey to Docker Production: Evolving Your Infrastructure and Processes

Video Statistics and Information

Captions
Hello, good morning. So of course, like the rest of you, I didn't know what was in the keynote today, and it's very timely to talk about legacy apps in production, because that's what we're going to talk about today: not necessarily exclusively legacy, but we all have apps, right? We all have existing stuff.

Anyway, I'm Bret. A little bit about me: I am now a longhorn rider as of last night, and I've been a Docker user since around 2014, as both a dev and an ops user. We were actually using it at a startup I co-founded. The startup didn't work out, but our tech was awesome; Docker was great, it helped us out a lot, and that's when I started getting the bug for containers. So now I'm an independent DevOps consultant, I guess you could call it. I spend my time helping people get to production with Docker, so I thought I'd wrap up the last two years into some lessons learned and get into some actual practical stuff.

I'm assuming you're here because you want to go to Docker production. I'm not here to convince you that production is a good idea; hopefully you decided that before you got in here. This is after you've made that decision and you're asking: okay, what's next? How do we actually do that? You've already used Docker, you're probably using it in dev or maybe a little in test, and you're taking that next step. A lot of times the problem is decision barriers: there's a whole list of decisions in your way that you think you have to make. Some of them are real, and some are not; they're fictitious requirements that get added to the project because you think they're mandatory. So we're going to discuss those, try to get some of them out of your way, and break down some of those barriers.

Here's our agenda, so let's get started. Who saw the Star Wars trailer? Anybody? I'm excited, because I'm Gen X, so as an adult I now actually get to see what happened to Luke. I'm here to give you a sort of new hope around containerizing as-is, or as close to what you currently do as possible. The goal is that at every decision point in your project, you question whether the thing you're about to decide on really needs to change for the sake of deploying a container, because containers are fundamentally not a whole lot different: it's still just a process running on a host. Sometimes we add decisions that aren't actually necessary, and I call avoiding that "limiting your simultaneous innovation," which is a fancy phrase for reducing scope and limiting the things you're doing at once. There are problem areas I commonly see people dealing with that don't need to be part of this project when they're going to production.

Fully automated continuous integration and deployment is not actually a requirement. That might seem obvious to a lot of you, but I see a lot of cases where, just because you have to go in and type some Docker commands, people assume they need to automate all of that first. If it's not automated today, it doesn't have to be for containers; containers can operate just fine without it.
Scaling is another issue where people want to do all the things at once: we want microservices, we want to fully scale up and down, and we want to deploy all of it magically. These aren't container requirements; that's just adding projects onto this project and slowing down your path to production.

Then there's service discovery, where people think they need to implement service discovery for their first three servers or something like that. That's another requirement that isn't hard-set. If you're not familiar with service discovery: once you get containers and agile deployments, you start having challenges around things changing location, and that's definitely something you'll have to deal with eventually. Swarm helps you a lot with that, but it's not something you necessarily need to add as a hard requirement. You need to experiment and figure out whether you can get around it for the time being, because the goal here is just to get into production and then start the learning process; you're going to learn a ton on day one and onward, so let's make that happen faster.

Persistent data is not a huge deal if you're not trying to do a whole lot of magic with it. The nice thing is there's new stuff in the market and those problems are being solved all the time, but don't make persistent data your first thing. Maybe do the web front end or the API part and leave your databases where they are. And as we just heard for the last couple of hours, legacy apps still work in Docker. Microservices are another thing people love to think about converting to (that was a great talk we just had), but it's definitely not required. I have tons of legacy apps running just fine in containers for my customers, even ones that are potentially fifteen years old. PHP apps, right? I had one like that last summer.

Twelve-factor, if you're not familiar with it, is a set of design principles around how to develop with agility in the modern cloud era. You've probably heard of it and you're probably trying to follow it, but I want to frame it as a horizon, not a destination. I see a lot of teams putting hard requirements on their container projects that come from some of the twelve-factor principles, when those aren't required on day one. When you're trying to follow twelve-factor, realize it's an iterative process, for the rest of your life, to get your apps to the perfect twelve factors, and maybe we need a little bit of a recovery program for those of us with a twelve-factor-or-die attitude.

So let's talk about the things you do need to do up front, and the first is Dockerfiles, because I think the Dockerfile is really the underpinning of your infrastructure. I would actually prefer you have better Dockerfiles than fancy orchestration. A Dockerfile might work in dev and work okay in CI, but once you get to production there are some things that hang people up. If you're not automatically building servers today (and a lot of us aren't; a lot of us are still manually deploying and configuring servers), maybe that's not a big deal to automate in a smaller shop because you don't do it often. But once you start Dockerizing, think of the Dockerfile as your first step into build documentation.
You're now going to be able to see the steps that built your server, which is why it's good to put documentation in there. I love comments in Dockerfiles. If you have an ops team that's new to Docker (maybe the developers are already on Docker and ops is just getting there), that Dockerfile is one of the first things they'll see to help them conceptualize how their processes are going to change, and if it's a super fancy Dockerfile with no comments, that just gives them more challenges. So I love Dockerfiles that are simple, maybe a little more verbose, but they get the job done and they work just fine. And obviously start from the official images; that's a good design goal when you're first starting out. The official images on Docker Hub have been time-tested for years, they use the best practices, and you can inspect their Dockerfiles and learn a lot from them.

Let's go through some anti-patterns, because with Dockerfiles going into production there are specific things that pop out as issues, and it's easier to talk about what not to do than what to do, since most of us already do a lot of good things in Dockerfiles.

The first anti-pattern is trapping data. The VOLUME command showed up a couple of years ago, and it's definitely something you want to use. The simple version: put VOLUME commands in your Dockerfile for all your persistent data. Databases are easy; you know where those database files will land. But what typically happens, as the ops team starts practicing the routine of going to production, is we say okay, we're going to take down this container and recreate it from a new image, and then, lo and behold, there's some other form of data in there that, oh, maybe we want to keep. Maybe it's debug files the process dumped out when it exited, or log files sitting in there (which you probably shouldn't have anyway): things you don't think about until someone says "I want to delete this container" and suddenly you realize there's data in there you consider backup data, important data you want to keep just in case. So put that in volumes. Add extra VOLUME commands to your Dockerfiles. You may never use that data and it'll just be part of your cleanup process, but having it there is a key step. Typically we end up adding VOLUME lines as we learn, in production, where the potentially valuable temporary or debug data lives inside some of the apps.

By the way, if you're a developer, don't confuse this with bind mounts, where you mount the host file system into the container at a directory of your preference. That's normally done for development purposes on your local machine. In production it's usually better to rely on the VOLUME command and named volumes rather than bind mounts; not that you can't use bind mounts in production, it's just generally preferred from an ops perspective.
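As a rough illustration of that "don't trap data" point, here is a minimal sketch; the base image and paths are made-up examples for a hypothetical legacy app, not anything from the talk's slides:

```
# Hypothetical legacy PHP app; tag and paths are examples only
FROM php:5.6-apache

# App code baked into the image
COPY . /var/www/html

# Persist anything you might regret losing when the container is removed:
# user uploads, crash/debug dumps, and logs you have not centralized yet
VOLUME ["/var/www/html/uploads", "/var/log/app"]
```

Data written to those paths lands in a volume rather than the container's writable layer, so the routine "remove and recreate the container" described above doesn't silently delete it.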
The next anti-pattern is no version pinning. We all do this; I'm guilty of it. We all start with latest; we just type FROM nginx and go from there. I really recommend you pin the version on your FROM line, and some images even have tags with the actual build date, so you can pin to the specific date it was built and control exactly which image you get. Once you start iterating in production, if you're coming from a traditional shop where your VMs sit around for years, you're not used to the idea that your underpinnings are changing constantly; it's not just the app changing. You don't roll out random versions of your code, so don't roll out random versions of your production packages either.

Once you've pinned the FROM image, I recommend you look at the package-manager commands in your Dockerfile and pin those as well. It's not actually hard to do, they all support it, and it's key to getting production reliable, so that when you swap the image out on an application upgrade you don't suddenly get new packages. A lot of teams come to realize their servers have had the same packages for the last six months; maybe they did security updates, but they weren't completely rebuilding servers all the time, and iterating in production without pinning exposes that problem.
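A minimal sketch of what that pinning can look like; the base tag and package versions here are illustrative, not a recommendation:

```
# Pin the base image to a specific tag (or even a digest) instead of :latest
FROM ubuntu:16.04

# Pin the packages too, so a rebuild months from now pulls the same versions.
# apt-get accepts wildcard versions like 1.10.*
RUN apt-get update && apt-get install -y \
      nginx=1.10.* \
      curl=7.47.* \
    && rm -rf /var/lib/apt/lists/*
```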
Another anti-pattern is leaving the default config. What I mean by that is your PHP settings, your Apache settings, your Java memory, your MySQL settings: these should all be set in your Dockerfile, or at least be visible from it. If you have VMs or bare-metal machines in a traditional architecture, those settings might have been changed at some point by someone and never documented, especially if you don't have a strict build process. What usually happens is everything works great in dev and CI, and then we get to production and start hitting performance problems, and it's usually because the defaults aren't meant for production; they work great on your machine, but they're not designed for production load. So set those values in the Dockerfile, ideally, because your Dockerfiles double as your build documentation.

Don't just blindly copy config files in, either. I prefer that you specify these changes with environment variables in the Dockerfile. Yes, your Dockerfile might get a little long, but it puts all the information in one place and makes it much easier to understand how you can change things on the fly when you run the container. And don't take the configs from your VM, your MySQL or WordPress settings or whatever is on that machine, and blindly copy them over. The file paths inside the image you're starting FROM are probably different from the VM's, so that might break, or the settings might have been changed by someone in ops on the VM and never documented. Compare them against the defaults in the Docker image and bring over only the differences you actually need.

Environment-specific images are the next one: it's not a best practice to build a separate image per environment. This slide is a bad example, thumbs down on that cat. It was a situation where a team was building three different images for three different environments and copying in custom settings for each environment at build time, not at run time. Ideally you want your image generic, with sane defaults, so you can use the same image in dev, test, and prod and change the settings at run time, in the service create command. The whole goal is to reduce the change surface: with one image, you can be certain it's identical across the three environments, and you're only changing options at runtime. And when you end up with a fourth and a fifth environment, you're not building more images, you're just changing config values at runtime.
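As a rough sketch of that "one image, many environments" idea; the registry, image name, and variable names below are placeholders, not from the talk:

```
# Staging: generic image, environment-specific values injected at run time
docker service create --name api \
  -e DB_HOST=db.staging.internal \
  -e PHP_MEMORY_LIMIT=256M \
  registry.example.com/myorg/api:1.4.2

# Production: the exact same image, only the runtime settings change
docker service create --name api \
  -e DB_HOST=db.prod.internal \
  -e PHP_MEMORY_LIMIT=1G \
  registry.example.com/myorg/api:1.4.2
```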
Let's step back and talk about production infrastructure for a bit. The decision everyone has to make is: do I run containers on VMs or containers on bare metal? If you've been in this industry long enough, you were on bare metal, then you went to VMs, and now you're thinking, great, now we have to go back. I'd say stick with what you know first; I think that was mentioned in the keynote. Whatever you're doing today is going to work fine, and you don't have to force that change into this project.

Do some basic performance testing, though, because not everything is performance-equivalent in a container, especially when you start stacking containers on the same OS. If you're on VMs, you're probably putting one app in each VM; when you start running multiple containers per VM, that changes how the kernel handles scheduling, and there are other nuances, like storage drivers, that affect disk reads and writes. Not necessarily bad, just different. So spend a few days learning the basics. It doesn't have to be fancy and you don't have to buy a product; a lot of database and storage systems have open-source testing tools that are pretty easy to use. We actually did some of this with Docker and HPE this quarter. If you go to that link, there's a white paper I did the Docker analysis on. It's a sample of how you could do MySQL benchmarking and compare running it in a VM, in a VM with containers, and on bare metal: the pros and cons of each, and how each affects CPU scheduling, disk, and so on. There's no tweet-sized takeaway in there, but it's lessons learned you can adapt to figure out how you might want to do the performance testing yourself.

Linux distributions do matter. Docker is a pretty new technology and it leans heavily on kernel features. I'm not going to say it's unstable, but more issues are going to arise over your production lifecycle if you're on older Linux kernels. The minimum for running Docker is still, I think, 3.10, but that's not necessarily the one you want. Whatever distribution you've chosen, try to get to the newest version before you go to production; you'll probably have fewer issues later, once you start pushing capacity and getting better use out of your hardware, because you might otherwise hit problems the kernel has since fixed. If you have no opinion about which distribution to use, I'm not playing favorites, but I'll just tell you to use the latest Ubuntu long-term support release. That way you get years of support, a nice new 4.4-series kernel, and it's been well tested on the internet, so you're not going to end up with an empty Google search when you hit an issue. When people ask me "what would you do," that's what I do, a lot of people do it, and it works well.

And when you install Docker on your distribution, don't use the default package from your distro's package manager. I see that happen a lot with new people; they just run their package manager's install command, and that's not going to get you the latest version. Docker iterates quickly now: edge releases are monthly, and stable releases are still quarterly. So use Docker's own package repositories, which you can get from store.docker.com; all the distributions that have Docker packages are listed there.
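For example, and this is just one minimal way to bootstrap a test box rather than anything prescribed in the talk, Docker's convenience script sets up Docker's own repository for your distro instead of using the distro's default package:

```
# Set up Docker's package repository and install the current release
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

# Confirm which version you actually got
docker version
```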
Once we've decided on the host or VM-layer OS, let's talk about the distribution inside your images. We already talked about pinning it; I also wouldn't make the size of the image a priority in deciding which one to use. I see a lot of people getting very excited about Alpine because it's a small distribution, five megs in size, and that's really cool, nothing wrong with it. Just don't make it a requirement of your first container deployment to production, because it will change all your build documentation: different distributions have different file paths and different package managers. Stick with what you know. Stay with the distribution, and the edition of it, that you're already using in your VMs; it'll work.

Swarm architectures: there are a lot of best-practice sessions happening this week around swarm, but I want to give you a quick sound bite about what to do when you build out your swarm. I'd call these good defaults, the baseline you'd start from. They're based on a lot of things: real-world deployments I've done, Docker's reference documentation and internal testing, and Swarm3K. If you haven't heard of Swarm3K, look it up; it's a pretty cool project one of the Docker Captains runs about once a year now, where we connected around 4,500 nodes from around the world into the same swarm, deployed NGINX and WordPress, had a lot of fun, and got a lot of analytics about how swarm scales and how to deal with it.

I'm going to start with the world's best network diagram and tell you that the single node is a real thing; let's not pretend it isn't. We all have single-instance servers somewhere, some system with no failsafe if it goes down. It happens; it's not a perfect world. I just want to mention that Docker swarm still works on a single node. Why would you run a single-node swarm? First, you get features you wouldn't otherwise get, like secrets; you can use secrets on that one server. Obviously you have to back it up, because it's not redundant, but it works if you have to do it, say for an old legacy application that doesn't behave well across multiple servers. I see that happen. What I'd suggest is doing it this way so that even in CI testing your commands are the same and your processes are the same, and once you get to production on swarm you're using the same stuff.

Three nodes is the minimum you need for HA. They're all managers and workers, which means each node is doing management of the swarm as well as running your application's containers, but you can only lose one node: after two nodes fail, you can't manage the swarm anymore. Your app's containers will still run, but you can't make changes to the swarm.

The high availability I actually recommend as a bare minimum is five nodes, and these don't have to be huge instances, because swarm managers are very efficient: it's all written in Go and it's all inside the engine. For companies that are going to rely on this to make money, five is the absolute bare minimum I recommend. That way, if you take one node out for maintenance and another node goes down while you're working, you're still fine; you still have management control and you're good to go.

Ten is not a magic number, but once you get past five or so nodes, I recommend you split the managers out and separate them from the workers. You basically just use some docker swarm commands to split them out, and then you have dedicated workers and more security control over your managers. And I'll add: going beyond five managers is worth questioning. You don't want six or seven managers; nothing technically wrong with it, it's just a bit unnecessary. Do you really need to survive three simultaneous manager failures? Maybe you do, but I have not yet seen a customer with a legitimate use case for seven managers. Remember that among managers only one is the active leader doing the work; they're all talking to each other, but you don't get more management capacity by adding managers. Capacity is really about your hardware and how many apps you have, not how many swarm managers you have.
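A minimal sketch of the "split the managers from the workers" step mentioned a moment ago; node names are placeholders, and the commands are standard swarm node management:

```
# Dedicate the managers: stop scheduling application work on them
docker node update --availability drain manager1
docker node update --availability drain manager2
docker node update --availability drain manager3

# Promote or demote nodes as the swarm grows or shrinks
docker node promote worker7
docker node demote manager4
```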
Constraints on swarm services are how you control where your containers go, and it's not that hard when you're getting started; you don't need a lot of fancy work to get containers where they need to be. There are simple commands for adding labels to a node that has unique hardware or a unique location, like sitting in a DMZ. The goal is to have one swarm, or at least to reduce the number of swarms you need: you can have different hardware in the same swarm, different operating systems in the same swarm, and nodes in different network segments. When you deploy your services, you add a simple constraint that tells the scheduler where to place that service's containers. That's what my customers use: if they have SSD nodes and standard-disk nodes, or DMZ nodes and internal networks, they just set up constraints like that. It's pretty easy.
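Roughly what those label and constraint commands look like; the label keys, node names, and images below are made-up examples:

```
# Label the nodes that have special hardware or sit in a special network segment
docker node update --label-add storage=ssd node-3
docker node update --label-add zone=dmz node-9

# Constrain services to those nodes at deploy time
docker service create --name db \
  --constraint 'node.labels.storage == ssd' \
  mysql:5.7

docker service create --name proxy \
  --constraint 'node.labels.zone == dmz' \
  nginx:1.13
```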
I'll throw out the 100-node swarm in case you need to go there. You're probably not going to change a whole lot from the five-manager scenario: you still have the managers separated out, by this point you're probably complex enough to have multiple security groups or VLANs, and you're still using constraints and labels to control where things go. Not much has changed except probably your instance sizes; you likely have bigger instances, and more of them.

People ask me: when would I stand up a new swarm? How many swarms do I need? What are the decision factors? We just covered the not-so-good reasons: different security groups or subnets, different parts of your network, or different hardware configurations are not good reasons for a new swarm. Good reasons are geographical differences; ideally you want a single swarm to live in a single location. It's not a hard requirement, but latency obviously affects performance when traffic goes over the wire, so you want your managers as close to your workers as possible. Security boundaries like PCI are a good reason, for obvious reasons. And personnel boundaries: if you have multiple ops teams, or delegation of authority inside your ops team, the Docker API out of the box does not give you granular RBAC, so you either create separate swarms, with different people having access to each set of servers, or you use a product like Docker EE, which gives you UCP, basically a web UI that does authentication and granular controls, so you can give people read-only access to your swarm, for example.

Externally driven deadlines: what I mean is a deadline you didn't set, some arbitrary date in the future by which your containerized swarm thing has to be set up and running wonderfully. Maybe it's the Christmas shopping season and you have to be containerized by then. It turns out the project managers for the Rebel Alliance at Yavin 4 also had an externally driven deadline, called the Death Star, looming on the horizon. If you have a project deadline like that, you have to make some tougher decisions, because you'll need to accelerate your learning path and get into production even faster.

The first thing I see is not-implemented-here syndrome. Kelsey Hightower tweeted about this recently and gave me the idea; it's a problem that keeps happening: people want to do it all themselves, implement every part of their orchestration themselves, every feature of their systems themselves. When I see tough deadlines, we have to decide what to cut out of the project, and I usually look at what we can outsource. I don't necessarily mean hosting it on the internet; I just mean using a prebuilt product that solves the problem so you're not managing the entire environment and learning a whole new product. My criteria: it's a well-defined product market with a lot of good solutions and options, and it's a piece of your infrastructure that's easily changeable later. That last part is key, because you're going to learn a lot and you might not keep that product forever.

For your consideration: image registries. That's a well-defined marketplace, and running one yourself is not a lot of fun. Nothing wrong with it, it's just a lot of extra work when there's already great stuff out there: you'd be dealing with image storage, cleanup and garbage collection, TLS and security for your registry. So why not use something hosted for now? Maybe later you decide to bring it in-house.

Log aggregation is the next big one. If you're on 17.05, the latest version that's in release-candidate status right now (it just came out last week), there's a new "docker service logs" command. That can get you started without a centralized logging system: if you're a small team without an existing logging setup and you just need to see five nodes on one screen, that command can do it. It doesn't have many features, no long-term storage and no fancy search, but it solves the basic problem of getting a swarm's logs onto one screen.

And then monitoring and alerting: these are all places where the marketplace has matured, and it's honestly pretty easy to drop in a monitoring or logging solution in an hour, with defaults obviously. If you use a SaaS product, that's their whole goal: easy to implement, easy to replace. They don't always live up to it, but that's the goal. So you can do this without making it part of your project, and maybe as a follow-on project you decide to bring it in-house, if you don't already have it.
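For reference, the service logs command mentioned above looks roughly like this; the service name is a placeholder:

```
# 17.05+: pull logs from every task of a service, across all nodes, onto one screen
docker service logs -f api
```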
Docker products can also accelerate your decisions. The middle one here is Docker for AWS and Docker for Azure; I actually got this question this morning. If you're doing Docker on AWS or Azure, there are existing templates. If you search "Docker for AWS" or "Docker for Azure" you'll find Docker's page for them, and they're in the Store as well. They're templates built on best practices from those cloud providers partnering with Docker to make the infrastructure work, and I think the Google one is in beta right now. With these you don't have to make all the infrastructure decisions: the OS, the network design, the security groups are all chosen for you, so they'll accelerate your decisions if you're on those platforms. The last one is Docker EE, Docker Enterprise Edition. You can deploy that on your own, or on top of Docker for AWS or Docker for Azure; it works with those as well. If you're going in-house in your own data center, it solves a lot of problems you're going to have later, because it implements things like the layer-7 HTTP routing proxy and role-based authentication.

A little reminder: as you mature your infrastructure with containers, your infrastructure is going to iterate like your code does. It's not going to stay the same for three years. They had to iterate on the Death Star because the first one wasn't so great, and you're going to have to do the same thing in your data center. The first swarm you build is probably not going to be your best one, and that's okay; it's going to work, and then you'll create another swarm or make changes to the existing one. The nice thing about containers and swarm is that they're designed around modularity and being easy to replace, so moving to a new swarm isn't a lot of work. Frankly, it's pretty easy.

What if you need even more acceleration, if your deadlines are so ridiculous you don't know how any of this is going to happen? If you already have good infrastructure in place and you're automating today, or you have auto-scaling enabled on your VMs, or you love the security boundaries of VMs and want to keep them, I'm just going to suggest one container per VM. I don't know why we don't talk about this more in the industry as a real solution, because it is one; it's just not the cool, sexy thing we talk about at conferences. It lets you use your existing infrastructure and processes to get containers into production, it lets you play with Docker in production and learn all the things you're going to learn, and it simplifies your OS builds, because that work shifts from the OS into the Dockerfile; it lets you replace some of your infrastructure code with Dockerfiles. And as of this week we now have projects about changing how the kernel is packaged, so maybe we get to a point where VMs are efficient enough that running one container per VM isn't a big deal in terms of cost and infrastructure.
In fact, this is already happening today: on Windows, Hyper-V containers run one container in a VM, and Linux is doing the same thing with Intel Clear Containers, one container per VM. These are just reduced kernels and feature sets with fancy names, but that's what they really are. So I'm giving you permission to treat one container per VM as an acceptable practice, because it works and it lets you keep a lot of your infrastructure the same.

Lastly, another way to move even faster is to change your mindset about what production is. If you're an IT shop with a significant number of people, you might have other opportunities for what I'd call using Docker in production internally. As an ops person, I consider production to be anything someone will complain to me about if it goes down; it may not be customer-facing, it may be internal. I've seen this with help desk operations: we had a scenario where tech support was using VMs as mock environments for the customers' applications, and it turned out they were just web applications. Turning them into containers, with a little bit of training so support could spin them up themselves, made things a lot easier and gave us a lot more variation: we could have multiple versions on their machines without the disk capacity or performance problems.

Customer demos are another one. I had a customer with 35 servers per environment, so building out an environment for a new customer took an engineer probably half a day, even with a whole lot of automation. And we kept getting demo requests: we've got a potential new customer, can we spin up a new demo environment? That was half a day of work for somebody every time. We weren't ready to containerize the real production environments, but we were willing to do it for demos, and once we containerized those, we got to learn and play with "production," so to speak, because these were three-day-long productions that then went away. It was a really great way to introduce everybody on the ops side to Docker and figure that stuff out. My alarm bells go off any time I hear someone say, "in order to do that, we need to wait for someone to deploy or configure a VM." That right there is a ripe opportunity to play with Docker and put it into practice, and even if it's not a production-esque system, it gives the ops team a workflow to play with.

So thanks for coming, that's the talk. We're going to take questions now, but I hope you learned something and I hope you get to Docker production soon. [Applause]

Host: How'd we do? All right, thank you, Bret. We're going to take questions now; if you have a question, please come up to one of these two mics. I'll start us off with a question I get asked a lot: let's say I have a legacy application with a database server, an application layer, and a front-end layer. If I'm pressed for time and I want to containerize something, what would you containerize first?

Bret: Probably the part closest to the customer, so probably the web front end. My assumption with those three parts, database, API, and web, is that the front end is the one iterating the quickest, so that's where you're going to get the biggest benefit.
You're probably not changing your database infrastructure constantly, but you probably are shipping new versions of your API and web all the time, and on the teams I work with, the web is usually iterating even faster than the API, so that's where you get the biggest benefit. And web apps, hopefully, if they're stateless, are the easiest to deal with in swarm.

Host: A second question. I think the heuristic of just getting something into production and putting off decisions you don't need to make early is great, but obviously there are some decisions you want to make up front because they'd be costly to change later. In your opinion, what's one decision you should put some thought and design into at the very beginning?

Bret: Like I said, the distribution of your host OS is a big one. A lot of people just use what they have and then find out they weren't paying attention to kernel versions; they didn't need to care so much on their VMs, maybe they made that decision four years ago and have been on the same version since. With Red Hat, for instance, I'd definitely recommend the latest version so you get the newest kernel features for Docker, and a lot of the shops using Red Hat are larger enterprises that move slower on the ops side. So usually the first thing we do when starting a project is check the versions of their infrastructure, because that's usually the longest lifecycle; maybe they need to buy something, and there's a purchase decision involved.

Audience: I have a question about Dockerizing something like Cassandra that has its own built-in clustering technology and doesn't rely on something like swarm. Do you use swarm with that? Any direction on best practices?

Bret: I don't know Cassandra specifically, but for technologies that have their own built-in clustering, it's going to be use-case specific. There are apps in a similar situation; they were basically solving the orchestration problem themselves before all this fancy orchestration existed. Honestly, you're going to have to test it, because swarm mode is different from classic Swarm, and that's different from Kubernetes, and they handle things differently. For example, overlay networking is really cool, but it uses virtual IPs, and that might affect how these servers talk to each other. If you're going to use swarm in a situation like that, there's a DNS tweak: you can either turn off the virtual IP and use DNS round robin, or use "tasks.<service name>", which gives you the IPs of all the tasks in that service. So when an app needs to discover all the IPs itself, there are options in swarm that can get you around that. Thanks.
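A rough sketch of those two options; the service, network, and image names are examples, and the overlay network is assumed to be created first:

```
# An overlay network for the cluster to talk over
docker network create --driver overlay backend

# Option 1: keep the default virtual IP, but let the app discover every task
# via the special DNS name tasks.<service> (one A record per running task)
docker service create --name cassandra --replicas 3 --network backend cassandra:3.0
#   from another container on "backend":  nslookup tasks.cassandra

# Option 2: turn off the virtual IP entirely and use DNS round robin
docker service create --name cassandra-rr --replicas 3 --network backend \
  --endpoint-mode dnsrr cassandra:3.0
```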
Audience: Thanks for the great talk. I had a question about the suggestion of running one container per VM. We kind of have that scenario, but how do you justify the host being underutilized if you're running only one container, compared to, say, an EC2 instance where I'm not using the whole instance and could be running multiple containers?

Bret: Well, that's sizing, right? You had that problem when you were on VMs, and if you just do one container per VM, you're not going to solve it; if you couldn't deal with it before, because you needed something bigger than the smallest instance size and ended up underutilized, this scenario won't fix that. What it does do is let you get started faster. If your number one problem is capacity, performance, and cost, then maybe that's not the path you take into Docker production. What I see a lot is that teams without a mature ops shop, the ones who haven't got everything figured out to the point where they hit a button and magic happens in the data center, try to run multiple containers in multiple VMs per host, and now they have a sort of nested performance problem, a new paradigm of performance scaling, and that becomes a harder challenge. So my recommendation is to delay that decision and just do one container per VM. For that specific utilization problem, no, it's not going to help.

Host: Okay, thank you. Yes, go ahead over there.

Audience: We're in a traditional VM world today and our VMs are immutable infrastructure: we start an instance and it checks out all of its configuration files from Git, the MySQL config for example. Moving to Docker, you're saying to put that inside the Dockerfile. How does that change things when we're already using immutable infrastructure? We'd be taking things that are in a Git repository today and moving them out of Git into the Dockerfile. Does that still make sense when we have an evolved practice of doing things in version control? And who owns it: today the ops team controls those config repos and the developers have the code repo.

Bret: I actually had that kind of challenge about a year ago on a project, and honestly, the Dockerfile is a shared responsibility. It's neither side's property; it's an agreement, a contract between both. In a small organization the Dockerfile might just live in the repo with the code. In a large organization like yours, you might end up with a shared-responsibility repository that holds the agreed-upon Dockerfiles. For developers, on their local machines, that means pulling down their code repos plus the Dockerfile repo that works with that code; and in production, in your CI and CD platforms, you'd get the Dockerfile over to the code and glue them together. That's probably a simplistic example, but at some point you're going to agree on the Dockerfile. Whether you keep the MySQL configurations somewhere separate and just copy them in, instead of putting them directly in the Dockerfile, you're still going to be dealing with a Dockerfile at the center of it.

Audience: All right, thank you.
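A hypothetical sketch of that CI "glue them together" step; the repo URLs, layout, and variable are all made up for illustration:

```
# CI step: the app repo and the shared Dockerfile repo live separately,
# so the pipeline pulls both and joins them before building
git clone git@git.example.com:org/app.git
git clone git@git.example.com:org/dockerfiles.git

cp dockerfiles/app/Dockerfile app/
docker build -t registry.example.com/org/app:"${BUILD_TAG:-dev}" app/
```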
Audience: Hey Bret. I've got three questions, so I'm going to monopolize things a bit. First, are there any latency requirements between managers and workers?

Bret: There's no hard requirement. If you don't like how fast it is, that's probably your requirement. That sounds silly, but we did the whole Swarm3K thing across the entire world, in different regions, and it worked. Was it fast enough for that ops team? I don't know. The latency between the managers themselves is the more important part, because you're talking about a consensus algorithm that's constantly, every second, checking across the managers. They're small packets, but it's a lot of constant traffic, so the more latency between your managers, the slower they can make decisions about who's the leader and who are the followers, that sort of thing.

Audience: Next question. I don't want to start a holy war, but do you see more Docker swarm implementations on VMs or on bare metal, and are there any performance metrics, maybe from Docker?

Bret: The HPE white paper I mentioned was specifically about MySQL, so it talks about that comparison for MySQL, and it would probably be similar for other open-source database technologies. It compares all three: VMs without containers, VMs with containers, and bare metal. There's also another HPE white paper, it's on that website or shoot me an email and I'll get it to you, that's specifically about the performance of containers in VMs versus containers on bare metal. The CliffsNotes version: bare metal is awesome. If you can get there, it's a win in most situations: you have less to manage, the performance is amazing, and depending on how many workloads you put on that OS, with a lot of containers you can really see the scheduler do cool things. But I don't see a ton of people going bare metal, because we don't have a lot of cloud providers offering it. Joyent does, but that's on SmartOS, which might not work for you, and the others that have it aren't very popular options. It's not that bare metal is bad; it's that people have spent the last fifteen years building VMs and all the infrastructure around them, and going back is going to take us a while.

Audience: Last question. In the demo earlier they were talking about Oracle. How do you scale services like that, just vertically, or can you do horizontal scaling with multiple clusters?

Bret: I wouldn't say Docker changes that; it's going to be limited by the application. When I showed the little one-node swarm, that's a scenario that happens a lot: even on a swarm, a lot of apps have some sort of persistence or session-state issue and can't be spread out, they have to be on one node, and that's an application problem Docker isn't really going to fix, even with swarm. It might help you automatically recover on a different node if that node fails, and swarm will help with that, but I'm not really an Oracle guy, so I'd say use whatever Oracle's existing cluster technology is. Oracle is in the Store, so it must work on swarm, but it's up to the app.

Host: Okay, we have time for a couple more questions and then we'll end the session. A reminder for everyone: if you really liked this session and you have a friend who wants to see it later, please vote for it on the app. We're doing replay sessions all day tomorrow, and it would be awesome to hear Bret talk some more. Two more questions and then we're good.

Audience: Thanks. In our scenario we have two private registries, one effectively for our dev environment and the other for our production environment. Is there a way to do replication between them automatically? The reason is that we have a decision point where a decision-maker determines whether or not an image gets pushed out to the production repo, so I was wondering if there's some built-in ability for that.

Bret: Does anyone know the answer to that? Because I don't; I've never heard of that specifically. [An audience member mentions a tool.] Okay, I'm not sure what that is, but it sounds like there's something in the registry space that can help with it. The other thing I'd say is that Docker Trusted Registry, part of Docker EE, has a new caching feature, so if you have one master registry and need to distribute the images geographically, it can do that with caches. And if I had to do it and didn't know about any of that, I'd probably just set up some automatic scripting that pulls the images, tags them, and pushes them: really, exactly what you're doing today, just automated.
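That manual "pull, tag, push" promotion, scripted, might look something like this; the registry hostnames, image, and tag are placeholders:

```
#!/bin/sh
# Promote one image from the dev registry to the prod registry
SRC=registry-dev.example.com
DST=registry-prod.example.com
IMAGE=myorg/api
TAG=1.4.2

docker pull "${SRC}/${IMAGE}:${TAG}"
docker tag  "${SRC}/${IMAGE}:${TAG}" "${DST}/${IMAGE}:${TAG}"
docker push "${DST}/${IMAGE}:${TAG}"
```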
Audience: Hi, thanks a lot for the talk. The recommendation you made about kernels: we really got bitten by that. We used CentOS 7 with the default el7 kernel, the container we had was doing a lot of writes, and we kept seeing out-of-memory errors. It took weeks to figure out that we basically just needed to upgrade the kernel, and then everything worked.

Bret: Thanks for sharing; that happens. Linux has gotten stable enough in the last decade that a lot of the time whatever you have just works; Apache runs on every version, right? But that's the thing about Docker: once you get into swarm, you get into overlay networking, you're using IPsec and virtual IP addresses and all this stuff that leans on more kernel features and more kernel drivers, and I'm sure there are issues there I'm not even aware of. If you search the forums and the GitHub issues for "kernel," you end up with a lot of results. So you upgraded the kernel on CentOS and it solved the problem?

Audience: Yes, and we switched to overlay2 as well, which is part of my follow-up: on storage drivers, do you have any recommendation between AUFS, overlay, and overlay2?

Bret: I started using overlay2 about three months ago. It's also now the default, by the way, on Docker for Mac and Docker for Windows: if you have an older version from last summer it started with AUFS, but if you wipe it clean and reinstall, it uses overlay2. That's not really an indication of what you should do in production, but I think it is an indication of where Docker sees the future going, which is overlay2. On Red Hat, though, you're going to use devicemapper and not overlay2, because if you're doing official Red Hat, that's what they invest their time in and support.

Audience: Right, and overlay2 is only available on 4.x-plus kernels.

Bret: Yeah, that's another thing. True.
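If you do pick the driver explicitly rather than taking the default, a minimal sketch on a systemd-based host (assuming a 4.x-or-newer kernel for overlay2, and that no daemon.json exists yet to be overwritten) looks like this:

```
# Set the storage driver in the daemon config, restart, and confirm what is in use
echo '{ "storage-driver": "overlay2" }' | sudo tee /etc/docker/daemon.json
sudo systemctl restart docker
docker info | grep -i 'storage driver'
```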
Host: All right, I think that's all the time we have for questions. If you have a question for Bret, please come up to the front and ask him yourself; he's free, though he may not get to eat lunch if he gets a lot of questions. Everyone, thanks for coming, I think it was a great talk. Come to the next session here if you want to learn troubleshooting tips from a Docker support engineer. Thank you.
Info
Channel: Docker
Views: 10,066
Rating: 5 out of 5
Keywords: docker, containers, Using Docker
Id: ZdUcKtg84T8
Length: 49min 36sec (2976 seconds)
Published: Mon May 08 2017