UCMS '13 - Continuous Deployment with Ansible

Captions

Thanks for having me. My name is Tim, I'm from AnsibleWorks, and I run the services and support organization there. Is the volume okay? It's a little booming to my ears, but all right. I'm going to give you a quick overview of what Ansible is, we'll have a demo of some of the interesting functionality, and then hopefully we'll have some time for questions, if you folks have questions.

First of all, how many people have heard of Ansible and know generally what it is? Okay, so I've got kind of a fresh crowd here. Ansible is an open source tool. It's an IT automation, orchestration, and configuration management system that draws a lot of ideas from older tools and brings a lot of things together. When we say orchestration, we mean manipulating an entire tier of an application, or an entire application stack that may be made up of multiple tiers, in a coordinated way. As opposed to a one-system-at-a-time approach to configuration and automation of tasks, Ansible can coordinate those operations throughout the system, and we'll show you an example of this as well. Ansible can talk to anything that has an API or exposes an SSH interface, so you can use Ansible to talk to your networking devices, your load balancers, firewalls, switches, and so on, as well as your typical UNIX or Linux machines. Ansible doesn't do Windows yet, but now that I know Bruce, hopefully we can make that happen in a way that makes sense.

I'm not going to go into detail about what continuous deployment is; with this crowd I probably don't need to. For the purposes of this discussion, I like to think of continuous deployment as frequent, unattended application updates: the ability to update your application in the field, maybe even several times a day or several times an hour, from code committed to rolled out to your customers. And to do that, to get to a
continuous deployment environment, I think there are some characteristics of the tools you choose that are really important. Number one, a tool should be easy to use. Dealing with your deployment tools and your configuration tools should not be a full-time job on its own; the tool should work for you. It should be easy to get started and do basic things, and it should be possible to do complicated things without becoming an expert in any particular tool. It should be non-intrusive: the tool should work with your workflow, the way that you've decided to deploy your application, and it shouldn't force you into layers of abstraction that may not make sense for your application, for your environment, or just for the way that you think. A tool that you use for continuous deployment should be predictable. You should be able to trust the results; there shouldn't be any undefined behavior or ordering weirdness that might interrupt your flow of work. And ideally a tool like this would be powerful and able to handle most of the tasks that you give it. You shouldn't necessarily have to bring to bear a lot of different tools and duct-tape them together. Configuration management, orchestration, general automation tasks, and deployment of applications are all really closely related, and it'd be nice to have one tool that does all three or four, depending on how you count.

So you might not be too surprised that Ansible satisfies, in my opinion, most or all of these requirements. Here are some of the things that are different about Ansible compared to other configuration management, deployment, and orchestration tools. Ansible is very simple. We like to say you can automate in plain English. That's not really true, it's not really plain English, but it's as close as we can get. It's really a
human-readable format, and I'll show you some examples of the playbook format that we've developed to express your configurations and your automation tasks. There's minimal jargon in the system. We try to keep the language pretty clear and keep the terminology straightforward, without introducing a lot of jargon-y words and terms to describe what's going on. When you're writing an Ansible configuration you're not writing code. There are very few traditional language constructs. The playbook format is a DSL, a domain-specific language, but you're not writing code, you're not writing loops and so on. And if you get to the point where you feel like you want to, we've built some mechanisms to help you get away from that and back to the pragmatic approach of describing your infrastructure as data.

We've taken a really pragmatic approach to abstractions. When some people first encounter Ansible, they wonder how you abstract away this idea or that idea to different levels, and we really try to encourage people not to do that, and just build your configurations in a way that's easy to read and easy to understand for someone who's looking at it six months down the line, or someone who doesn't necessarily know Ansible. We think we've pretty well achieved that goal.

Ansible is secure, not because we're security experts by any means, but because we have explicitly decided not to become security experts, and we rely entirely on SSH communication by default. Basically, Ansible piggybacks on your existing SSH infrastructure for connecting to the remote hosts under management. OpenSSH is probably the most well-reviewed piece of crypto software out there, and if there are any
vulnerabilities, those will be closed very quickly. The same can't be said for something like Ansible, which doesn't have quite as many eyes on it, so we gain the benefit of all of that. There's no custom public key infrastructure necessary to use Ansible, and there's no agent running on the managed servers. This is a big difference between Ansible and some other tools: Ansible is completely agentless. I'll talk a little bit more about this in the architecture overview, but we push a small module out to the remote host per task; it runs, returns the result, and deletes itself. That means implementing Ansible gives you no additional attack surface on your managed machines. It also means that we can coexist with other configuration tools. We're not going to conflict with a Puppet agent or a Chef agent or anything like that, and you can use Ansible in conjunction with those or other tools, whether that's BMC BladeLogic or Opsware or whatever.

I mentioned this previously, but Ansible covers more automation tasks than other tools. Ansible was designed from the very beginning to handle not only declarative configuration management, where you say that on this class of systems these files must exist and these packages must be installed at this version. You can express those kinds of declarative ideas in Ansible and Ansible will make it so. But you can also very simply define a series of tasks, basically taking a shell script and translating it to Ansible. This might be a good way to do your software deployment: say, check out your code from Git, copy it to the server, restart the service. That sort of deployment task is very easy to accomplish with Ansible as well. You don't have to translate it into the idempotent, resource-model, declarative-state type of
idea. You can mix and match the two, which is pretty powerful. Along with that, you can address tiers of machines as a group, and you can do it in a coordinated sense. I'll show you an example of this, where you manage a load balancer and a monitoring system during a rolling upgrade of a software application.

Here's an overview of the architecture. Ansible runs on a machine somewhere; this is your management machine. There are a couple of exceptions to that, which I'll talk about in a minute. Ansible runs on the management machine and has an inventory of your systems. The most basic form of inventory is just a flat file of host names in groups and subgroups, with associated variables. There are other places to get inventory. There are a bunch of modules; Ansible has about 120. We take the batteries-included approach, kind of in a Python sense, so if someone contributes a module and it's generally useful, high-quality code, we will bring it into Ansible, and we'll test it and maintain it. There are modules for everything from MySQL to OpenStack Quantum and so on. So you've got your playbook, which operates against things in your inventory. When you run a playbook, or when you run an ad hoc command, which is just a way to issue a command across a set of servers, Ansible will connect via SSH and copy the module over, the module will run, the results will be returned, and Ansible will move on.

There's another mechanism of connection; the connection mechanism is actually pluggable. The default is SSH, but we have a ZeroMQ-based mechanism which is about 10 times faster than SSH. So if you have thousands of hosts and you're hitting them over and over again with various configuration tasks, you'll probably see some benefit from switching to the ZeroMQ mechanism. We do a little bootstrap to install the ZeroMQ piece. It doesn't remain on the system; it actually times out after a certain period
of time, so we maintain the agentless state that way as well. In our experience, though, most people don't need to use the ZeroMQ mechanism. For most tasks you're not hitting all of your thousands of hosts over and over again in quick succession. If you are, then you might need ZeroMQ; most people use SSH.

We can also talk to networking devices, and we can manage load balancers. There are modules for F5, NetScaler, nginx, and so on in the system. And of course it wouldn't be 2013 without mentioning cloud. We can talk to cloud instances just as easily as any other machine. In fact, one of the benefits of the SSH connection is that for most cloud instances, whether they're EC2 or Eucalyptus or OpenStack, the default connection mechanism is an SSH key that's injected into the instance when it boots. Ansible can take that private key and connect right away without any other bootstrapping.

Ansible is about a year and a half old. It's an open source project and it's fairly young. We've had about 13 releases, I think, and we're seeing some really great adoption from the community. We have almost 200 unique contributors, which I think is pretty impressive for a project of its age. I think one of the reasons we have so many is that it's very easy to contribute to the system. If you're just building a small module, it's pretty easy to get up to speed; you can build a module for your task in a couple of hours, and then hopefully that will be useful for other people as well. We've seen great uptake: the Fedora infrastructure project is using Ansible for all their configuration tasks, among a bunch of others. If you want to know more, call me afterwards. I won't go into too much detail here, I want to stay fairly technical, but we're pretty excited about the interest that we've seen in the system.

This is our batteries-included list. There will be a quiz at the end, so
pay attention. We've got modules for everything from basic file and command execution tasks to specifics like RabbitMQ management and Django management, and a bunch of OpenStack stuff is in there. These modules are what give Ansible the power of the playbook language. You can write a playbook that only uses one module, maybe the command module to execute remote commands, but a lot of the power and expressiveness comes from the modules that are built into the system, which let you do useful things like manage MySQL databases from just a couple of lines of playbook content.

Some of the most common modules you'll see are, of course, for package management. One of the things you almost always do to a system when it boots up is install certain packages on it, so there are modules for yum and apt and Arch Linux and MacPorts and some FreeBSD stuff. You can specify that a package must be installed or must be absent, and you can ask for certain version numbers if you have specific requirements. That's the declarative sense of package management. Then there's the less declarative stuff, the command execution: you've got command and shell. Command just runs a command on the remote server via SSH; shell actually passes it to the shell, so you can use shell constructs and so forth. Service management is for init script management: restart services, start services, etc. And of course file handling: you need to copy files to your servers, and those modules are there. These are also more declarative. With copy and template, you say this file must exist and Ansible will make it so; if the file is already there and doesn't need to be changed, then Ansible will not change anything. And then for deployment, one of the common patterns that we've seen is people deploying directly from source control, so we've got
modules for the various popular source control management tools out there.

Here's an example of a very simple host inventory, and when I get into the demo I'll show you a real-life example. This is super simple: you've got three hosts in two groups. You can target your web servers or your database servers with an operation, or you can target all the hosts. Now, in a real environment a flat text file of inventory is not going to be very scalable, so we've got other inventory sources. You can source from your EC2 cloud, and hosts will be automatically grouped based on tag, availability zone, etc. We have similar modules for Rackspace and OpenStack. If you're using Cobbler, we can source inventory from there, and it's pretty easy to write a custom connector to another kind of CMDB if you have something that we don't have built-in support for.

Let's talk a little bit about playbooks. I'm going to show you a demo soon, and it'll be useful to understand a few of these things. A playbook is the top-level descriptor for your automation content. A playbook is made up of plays. Each play targets an individual set of hosts, so you might have a play for your web servers, a play for your database servers, a play for your load balancers, and a common play that applies to everything. Each play has a list of tasks, the tasks call modules, and everything happens in strict order, the way you specified in your playbook. There's no undefined dependency ordering or anything like that. If you need to express the idea of dependencies, you just have to turn it into a step-by-step list.

There's one exception to the step-by-step idea, and that's handlers. If you're going through a list of tasks and you need to trigger something... say you're configuring Apache
and you need to change something in the main Apache configuration file and then add a virtual host directive to another file. Using the declarative mechanisms in Ansible, you would need to somehow signal that Apache needs to be restarted if the files changed. What you can do is trigger a handler on each of the tasks that might modify a config file. If the config file does not need to be modified, nothing will be changed and nothing will be restarted. But if one or both of those tasks actually does change a config file, then the handler will fire at the end of the play, restarting Apache, so you're not going to get duplicate restarts or anything like that.

This is really the only jargon you have to learn. We've intentionally kept the language really simple so that you can talk about this to people who may not understand Ansible. You can show them playbooks, and even a non-technical person should in theory be able to look at a playbook, understand what's going on, and grasp the idea.

You can see an example here; hopefully it's pretty readable. This is a full playbook. It only has one play, and the play has two tasks. First we have the name of the play, then we tell Ansible which hosts it's targeting. Here we're just targeting all hosts; in the real world this would probably be hosts: webservers, to target your web servers. The tasks here are also named. These are free-form names, so you can say "install httpd". Then you call the yum module, you give it a package name or a list of packages, and you tell Ansible what state you want those packages to be in. Do you want them present, do you want them absent, do you want them on the latest version, or do you want them on specific versions? You can express all of that in the playbook. And then we
start the Apache service with the service module. Again it's a declarative state, so you say name=httpd, and the state we want it to be in is running, and if it needs to be changed, Ansible will make it so.

Ansible is a very extensible system. As I mentioned before, you can build custom inventory sources. You can construct callbacks if you need to do some sort of sophisticated logging, or logging to a centralized location. You can also develop your own connection plugins, so if you have a device that maybe has a different kind of API or doesn't handle SSH, you could build a connection plugin to talk to that device. We have a couple of connection plugins out there. Of course there's the SSH connection type, there's the ZeroMQ fireball mode, and we can also talk to a chroot. That's not exactly what's happening, but you can construct a playbook to work within a chroot.

Let me go through a really quick example, and then I'll actually show it to you. This is a common model. It's not complicated; it's something that everybody who runs a web application does. Some people do it by locking their team in a room and going through the steps manually; some people have this automated using custom scripts. The idea is you've got a multi-tiered web application. You have your load balancer and your monitoring system, and you have your software living somewhere. You've got app servers, web servers, and, not shown, the database server, which is usually present. The first thing you're going to do in the Ansible world is execute the playbook. What I'll show you live in a couple of minutes is that Ansible can signal the load balancer. Say we have 10 web servers, and we're going to take two at a time, or one at a time, or five at a time. Whatever serial number you've chosen, say two, we take those first two out of the
load balancing pool by signaling the load balancer, and tell the monitoring system we're taking these down for maintenance so that nobody gets paged unless necessary. Then we apply the update to the app servers or the web servers or both, moving on to the next tier. Once those two servers are done, Ansible will go back to the load balancer and the monitoring server and undo the changes, putting the servers back in the pool, assuming the tests pass. Then it moves on to the next two, and the next two, until the entire system is updated. This is not an uncommon task, and Ansible was designed to make modeling it in an automated sense very easy.

So let's see it in real life. I've got a handful of VMs here. Hopefully I can get my... let's see... beautiful. Okay, I'll just show you the inventory file. What I'm going to do is comment out the second web server. I've got a web server, a database server, a load balancer, which is HAProxy, and then a Nagios machine running. These VMs right now are all bare CentOS installs. The only thing I've done is set up an SSH key connection so I don't type the password every time.

I've got a playbook here... this is on, isn't it? I've got a playbook here that looks like this. It's pretty simple. It's calling into roles that I've developed: I have a common role, which sets up NTP and configures my software repositories and so on, and I've got a role for the database server. This top-level playbook is not very interesting. All it's doing is breaking down the hosts in my inventory and applying roles to them. Let me run this. It will take a few minutes to go through, and while it happens I'll talk about what's happening.

You see this "gathering facts" right here. If you're a Puppet user or know something of Puppet, you might recognize the term facts. Facts are just gathered
information about the remote system, so Ansible goes out and gathers the facts. That is a little hard to read, isn't it? What this says here is "changed": yellow is changed, green is OK. Changed means, of course, that Ansible changed something on the target host; green means nothing has changed. If my internet connection is fast enough (and I should really set up a proxy on my laptop to make this quicker), in just a couple of minutes we will get a fully installed Nagios, HAProxy, Apache, etc. stack ready to go, and I'll be able to show it to you.

The next thing I'll be able to do... let me just open up a new terminal here, hopefully that's visible... is show you the idea of a rolling upgrade. This is going to be a little tough. There we go. This is how we would express a rolling upgrade using Ansible. It's only a few lines of playbook content. If you can see here at the top, we're targeting the web servers and we're using a serial of one. This means we're going to target one machine at a time: we complete the entire process for web server 1, then move on to 2. Ordinarily, in a large environment, you'd set that to 5 or 10 or however many you wanted to target at once.

For each web server, we're going to disable Nagios alerts for that host and disable the server in HAProxy. You can see we've got a Nagios module that just says action: disable alerts. We give it the host name, and then we delegate to all of the monitoring servers that you have in your environment. Then we move on to the load balancer to disable that particular server. There are a couple of different ways to approach the next step, but in this case we simply reapply the existing encapsulated roles to the web servers that we're talking to. This will update the application to the version that you specify after those roles have been reapplied,
and that'll also apply any other configuration changes as well. So if you have modified the list of packages that you need installed on each of your systems, that will also be applied in the rolling upgrade.

Let's just check this over here. So we're installing MySQL.

Once we've reapplied the roles, which means applying the update, we undo the changes we made to the load balancer and to the monitoring server. We re-enable the server in HAProxy and we re-enable the Nagios alerts, so that if the system goes down unexpectedly you get a page. There are other things you can do here. One thing a lot of people like to do is put a little test case in before we re-enable the server, so if for some reason the upgrade failed, we can test and bail, and not go on to propagate that bad change to the rest of your servers, which is very easy to do in an automated environment.

Okay, so this is just about done. Let me show you what one of the playbooks looks like inside the roles. Let's look at the base Apache... well, we've already seen that, basically. Let's look at the web application that we're deploying. This is a fragment of a play, or a fragment of a playbook. It's just a list of tasks, and hopefully it's starting to look a little bit familiar. We're calling the yum module to install some packages. We're calling the SELinux boolean module to do some SELinux configuration. I usually just turn SELinux off, but this supports both. And then we clone a repository from Git into the correct directory. It's a trivial deployment. This can be extended to be more complicated, and most deployments are more complicated than this, but it could be a good starting point.

All right, we're still waiting on PHP and git. However, why don't we just be brave and see if the load balancer is responding. Let's see... nope, not yet. All right, we'll give it another minute or two.
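
The rolling upgrade walked through above can be sketched as a playbook along the following lines. This is an illustrative reconstruction under stated assumptions, not the playbook shown on screen: the group names (`webservers`, `monitoring`, `lbservers`), the role names, the `services` value, the HAProxy backend name, the socket path, and the health-check URL are all hypothetical. It uses current task syntax, in which the `nagios` and `haproxy` modules are addressed through the `community.general` collection.

```yaml
# Rolling upgrade sketch: one web server at a time.
# Group, role, backend, socket, and URL values are hypothetical.
- name: Rolling upgrade of the web tier
  hosts: webservers
  serial: 1                      # raise to 5 or 10 in a larger environment

  pre_tasks:
    - name: Disable Nagios alerts for this host
      community.general.nagios:
        action: disable_alerts
        host: "{{ inventory_hostname }}"
        services: webserver
      delegate_to: "{{ item }}"
      loop: "{{ groups['monitoring'] }}"

    - name: Take this server out of the HAProxy pool
      community.general.haproxy:
        state: disabled
        host: "{{ inventory_hostname }}"
        backend: myapplb
        socket: /var/lib/haproxy/stats
      delegate_to: "{{ item }}"
      loop: "{{ groups['lbservers'] }}"

  roles:                         # reapplying the roles deploys the new version
    - common
    - base-apache
    - web

  post_tasks:
    - name: Check that the upgraded server answers before re-enabling it
      ansible.builtin.uri:
        url: "http://{{ inventory_hostname }}/"
        status_code: 200
      delegate_to: localhost

    - name: Put this server back in the HAProxy pool
      community.general.haproxy:
        state: enabled
        host: "{{ inventory_hostname }}"
        backend: myapplb
        socket: /var/lib/haproxy/stats
      delegate_to: "{{ item }}"
      loop: "{{ groups['lbservers'] }}"

    - name: Re-enable Nagios alerts for this host
      community.general.nagios:
        action: enable_alerts
        host: "{{ inventory_hostname }}"
        services: webserver
      delegate_to: "{{ item }}"
      loop: "{{ groups['monitoring'] }}"
```

With `serial: 1`, a failed health check fails the only host in the current batch, so the play stops before the change reaches the remaining servers, which is the "test and bail" behavior mentioned in the talk.
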
Let's see. Ansible has the concept of variables, and variables can come from a lot of different places. The most basic place you can get variables from is your group variables. These are variable files that are applied to named groups in your inventory. If we take a look at the web servers here, we can see that we are defining an interface on which the web server should listen, which repository we should get our web application code from, and then the repository hash, the commit hash of the version that we want.

Come on, SELinux, we're so close. There we go. All right, so you can see we're actually deploying the code to the web server, and now we're working on the load balancers. I apologize, I really should have had that proxy set up so I wouldn't be waiting for the downloads. Let's see. All right, we probably have a minute left. Does anyone have any questions in the meantime? We can get ahead of that, maybe. Yes? [inaudible exchange about the microphone] Yeah, there we go.

Audience member: You actually did not specify the network plug-in in the playbook files, so it takes the default, I guess SSH, right?

Tim: Yes, it defaults to SSH.

Audience member: If I have to actually specify it, is there an option for that? Let's say I'm using some other networking plug-in.

Tim: Yeah, so if you want to use a different connection plug-in you can just say ansible-playbook, I think it's -c, something like that. You can specify it on the command line. You can also specify it in the play. This becomes important if you're dealing with local actions, which are slightly more advanced than this talk. If you're doing a local action, you're running something locally on the Ansible machine for some other purpose. Maybe you're calling an API to launch a cloud server, and then you're going to be operating on that cloud server. There's a special kind of
connection plug-in called local, which just runs locally. You almost always specify that in the play, because you've got a specific reason to do that, but you can override the connection on the command line as well, if you know what you're doing.

Audience member: Thanks.

Tim: Other questions while we wait for the repository? Okay, let's see if our load balancer is up. There we go. So we have a basic application. All it is is a file that serves a version, and it's fairly simple. I want to show you Nagios, but Nagios is not installed yet. What I will show you next is a way to apply an upgrade to those web servers in a rolling fashion. And I really wish Nagios would install, because I'm impatient.

The rolling upgrade will look like this. That's a little hard to read, but what we're saying is: run the rolling upgrade playbook, and we're overriding a variable that lives in the system. If you recall the group variable file that I showed you, we had a repository hash in there. I could put a specific repository hash in that file, or I could put it on the command line. In this case I'm going to go ahead and specify to pull the HEAD of the repository.

Come on, Nagios, you can do it. I'm not going to get brave and run both of these playbooks concurrently, because I don't really know what's going to happen. What I am going to do, because I only have ten minutes left and I want time for questions, is remove the monitoring host entirely and run the upgrade. So here we're hitting the load balancer, and it's redirecting us to one of the web servers. Version 5 is the version of the code, and we're going to go ahead and run the rolling upgrade now. This is by necessity going to skip the actual step of talking to the monitoring server, but what it will do is talk to the load balancer, take the affected server out of the pool, upgrade the application, and put the thing
back in the pool. You can see it's actually going through all of the configuration steps for the web server, but none of them are going to change except for the actual code deployment, because we've already done things like set up Apache and install the Nagios plugins and so on. This is an example of Ansible's idempotency: you can run the same playbook over and over again, and if you've written the playbook in such a way that it is idempotent, nothing is going to change. You don't have to; you can write a playbook that just runs commands, and those commands will run, but you've got ways to give it that idempotency.

Speed of the network aside, I'm impressed that you guys have conference-wide access here. Good work. [inaudible audience remark] Yeah.

All right, so we've applied the upgrade and now we're on version 6. Nothing too exciting. It's a simple thing, and people do this all day, every day, but Ansible gives you a really quick and easy way to model this kind of thing, and it can be extended to much more sophisticated deployment tasks.

A couple of examples. One of our customers, AppDynamics, is a very quickly growing application performance management company based in San Francisco. They're deploying their application to their customers, or into their data center environment, on over 250 machines many times a day, and they're using Ansible to do that. They're expecting it to scale to whatever extent they need it to, and they're not slowing down anytime soon. Gawker Media, the guys behind the Jalopnik blog and Gawker and Lifehacker, I think, are also using Ansible to deploy their content management application many times a day, basically triggered from commits to their source control.

This is how we tie back to continuous deployment. Somebody commits some code, it goes to a test suite, maybe it has a physical
reviewer to sign off on it (in the case of AppDynamics, they're using Gerrit to do that approval), and as soon as that code is ready it's deployed to the system completely automatically. It saves these companies many, many man-hours and man-days of work to streamline their deployment and get new features out to their customers faster.

All right... whoops, just messed up my slide. Hang on a second, I want to put this up here quickly. Any questions? About five minutes left.

Audience member: If the git module doesn't do everything that is needed, special rev-list commands or submodule configuration, is it better to modify the module, use the shell command, or create a new module?

Tim: Good question. I think we would love to see a pull request to improve the git module. If there are things that it doesn't do that are normal Git procedures, we would absolutely bring those back into core. If you're drastically changing the behavior, if you're breaking backwards compatibility (backwards compatibility is important to us), we may suggest a separate module. If you're doing something that's probably not going to be useful to anyone else, I would probably suggest just doing a set of commands using the command module. But if it's something you think other people would use and it doesn't break backwards compatibility, we would very much love to see a pull request on GitHub for that.

More questions? Great, thanks. I've got some stickers and business cards up here if you're interested. Thanks, guys.
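
The answer to the last question (use the git module for normal checkouts, and fall back to plain commands for one-off operations the module does not cover) might look like the task list below. It is a minimal sketch, not taken from the talk: the variable names `repository` and `webapp_version`, the destination path, and the particular `git rev-list` invocation are hypothetical placeholders.

```yaml
# Task list sketch; variable names and paths are hypothetical.
- name: Check out the application at the requested version
  ansible.builtin.git:
    repo: "{{ repository }}"
    dest: /var/www/html/app
    version: "{{ webapp_version }}"

- name: Run a one-off Git operation as a plain command
  ansible.builtin.command:
    cmd: git rev-list --count HEAD
    chdir: /var/www/html/app
  register: commit_count
  changed_when: false            # read-only query, so never report "changed"
```

The second task trades the module's declarative behavior for flexibility, so `changed_when` is set explicitly to keep repeated runs from reporting a change.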

Info

Channel: USENIX

Views: 83,128

Rating: 4.814672 out of 5

Keywords: usenix, ucms, ansible, continuous deployment, continuous integration, configuration management

Id: PDRdCqFp2sY

Length: 38min 37sec (2317 seconds)

Published: Thu Jul 11 2013