Continuous Integration with Jenkins on Amazon EC2 [1 / 5]

Video Statistics and Information

Captions
Hi, my name is Jeff Shantz, and in this video I'll take you through the process of installing and configuring a Jenkins continuous integration server on Amazon EC2, and we'll do it all from scratch. Now, I will assume for the most part that if you're watching this video you probably already know about continuous integration, and therefore I won't spend a whole lot of time extolling its virtues. But just in case you are new to continuous integration, I'll take a few minutes to briefly describe it.

So, to understand the need for continuous integration, we'll first imagine a scenario that's actually fairly common in software development. We have a few developers working on a project. Each of them goes off and individually implements a few classes. So they code them up, they write all sorts of unit tests to ensure that their classes work properly, and at the end of it all they're patting themselves on the back, thinking that their program is going to be very, very robust because their classes are so well tested. But then they go to integrate the classes: they combine them together to create the larger program that the team is developing, and suddenly everything breaks. Code fails to compile, all sorts of bugs arise, and this situation has come to be known as integration hell. It's essentially the time near the end of the project when we've finished all of our individual components, and now we're trying to integrate them together and suddenly nothing works.

Now, integration hell is a real problem for software projects, and it's very risky to their success. The reason is that while we may be able to come up with reasonable estimates of how long it will take to code up this class or that component, it's a lot harder to estimate how long it's going to take to fix things that go wrong. In other words, it's hard to estimate in the face of uncertainty. We have to start by figuring out what's wrong in the first place, and that can be very hard to do in a large project, and then, once
we've diagnosed the problems, we have to spend time fixing them. So we may end up going wildly over our allotted budget or our schedule, and that can be very risky to the success of our project.

So somewhere along the way, the folks that came up with the extreme programming methodology realized that, hey, so many software projects are failing due to integration hell; maybe a better way would be to integrate continuously throughout the entire project. And they therefore came up with the practice of continuous integration, or CI. So rather than waiting until all the components are developed and then spending weeks or even months working out the bugs that will inevitably arise when components are integrated, CI says that we must integrate early and we must integrate often. Indeed, it actually goes so far as to say that we must integrate upon every single change. So every time a developer pushes new changes to the repository, then ideally the project code should be checked out, it should be built (in other words, compiled), and all of its tests should be run. This way a team can ensure that the components being developed for a project interoperate correctly, and the team doesn't have this cloud of uncertainty looming over its head, knowing that it's going to need to spend weeks or possibly even months at the end of the project getting its components to integrate properly.

Now of course, if one is to rebuild a project and run all of its tests each time changes are pushed to the project repository, then the entire process is going to need to be automated, and this is exactly the purpose of a continuous integration server. Typically the process will work something like this: a developer will push changes into the repository, maybe a GitHub repository for example, and somehow the CI server will be notified of those changes. It will either poll the repository every few minutes to see if any changes have been pushed, or the repository will actively call the CI server to let it know when
a change is available. Whatever the case, when the CI server sees that changes have been pushed to the repository, it will clone the repository on the server, or it may instruct a set of build servers to check out all of the code, and then build the project. If the project fails to build, it will send a notification to the team, usually via email. But assuming that the project does build, it will run all of the project's tests, and if any tests fail, it will again send a notification to the team. Finally, the CI server will usually generate various reports, such as the line coverage of the project's tests, or maybe a report on style issues that were found in the project's code.

Of course, developers should be running tests on their code before they check it into the project repository, because we never want to be that guy who breaks the build. So why do we need a CI server if we're running our own tests? Well, first of all, a developer might just forget to run the project's tests before checking in his or her changes. The CI server, on the other hand, will never forget, so any problems in the code will still be caught even if our developers forget to run their tests. Secondly, on even a modestly sized project, running all of the tests could take anywhere from, say, 30 minutes to maybe all night long, many, many hours. Requiring a developer to run all of the project's tests before checking in changes is therefore often going to be infeasible, so it's useful to have a separate system on which all of the tests can be run, and that system, of course, is the continuous integration server.

I wrote a web project recently that had about 15,000 lines of code, not a huge project but reasonably sized, and it then also had another 15,000 lines of test code. To run all of the tests in the project would generally take around 30 to 45 minutes if I was running them on my MacBook, so it just wasn't really feasible to run them on my own system.
Instead, I let my CI server do it for me, and I could continue working on my code while it ran the tests. When it was finished, I would get an email letting me know whether or not the tests passed.

If we're developing a multi-platform or multi-architecture program, we'll also need to test our program in all sorts of different environments: Windows on Intel 32-bit, Windows on Intel 64-bit, Linux on Intel 32-bit, Linux on ARM 64-bit, and so on. So we can set up all of these different platforms and architectures that we need, and then the CI server can use them as build slaves. If the CI server detects a change in the repository, it will instruct all of the build slaves to check out the latest version of our code, build it, run all of the tests, and then report back to it.

Now, as we already discussed, beyond simply building and testing a project, a CI server is useful for generating reports about the state of the code. We often want to keep track of metrics like the project's line coverage, which is the percentage of lines in a project's source code that have been executed by its test suite. Line coverage is by no means a perfect metric, but a high line coverage will generally indicate to us that a program has been tested thoroughly, and we can therefore be reasonably confident that it is free of defects. So having the CI server run the tests and subsequently generate reports on metrics like line coverage can provide very valuable insights to the team and maybe also identify areas in which further testing might be required. Maybe it identifies a class where the line coverage is only 50 percent, and so we know that we need to go into that class and maybe write some more tests for it.

Beyond test metrics, there are all sorts of other utilities that we often run on our code. There's a program called Checkstyle that we can configure to check our code to ensure that it adheres to the style conventions that we've adopted for a project. Another example is a utility called FindBugs, which performs
what we call static analysis on our code to try and identify bugs in the code. Essentially, what it does is look for bug patterns, pieces of code that are likely to be errors, and then it generates reports on those areas so that we can go in and examine them. All of these sorts of utilities can be run automatically by the CI server every time we check in changes to the repository, which is going to help us to ensure that we're maintaining very high quality code at all times.

Finally, we can also deploy artifacts. For instance, on the web project I just mentioned, every time I would check in code to the repository, GitHub would notify my Jenkins server, my CI server, and my CI server would then check out my code, build it, and run all of the tests in the project. If the tests passed, it would deploy my code out to a staging server, in other words, a server that was set up to be similar to my production server but that was always running the latest version of the site's code, so that my beta testers could go on the site, play around with the latest features, and give me feedback on them. Similarly, we might be working on a desktop application, and if all of the tests pass, we might want the latest version of the program to be uploaded to our company's website so that our customers can always get the latest stable build. With a CI server we can accomplish this sort of thing very, very easily and in a totally automated fashion.

So in this video we'll see how to set up a continuous integration server on an Amazon EC2 instance, or in other words, a virtual machine. You may already know that Amazon operates the largest public cloud in the world, known as the Amazon Elastic Compute Cloud, or simply Amazon EC2, and it's a quick, inexpensive way to get a server up and running. EC2 allows us to fire up an instance, a virtual machine, a virtual server in the cloud, and we can do whatever we want with it. You can put up a website in a few minutes, you
could set up a mail server, and of course we can set up a continuous integration server.

Now, there are a lot of CI servers out there: Hudson, CruiseControl, TeamCity, and so on. We're going to use Jenkins, which is one of the most popular CI servers out there. It's a very active project, it's updated often, and it's also free and open source. Its interface might not be the prettiest, but it provides a whole lot of functionality right out of the box, and it also has plenty of plugins that you can install to further extend its functionality. Now, one thing you may notice if you look up documentation on Jenkins on the Internet is a lot of references to Hudson. Jenkins originated from the Hudson project. It was forked from that project after Oracle took control of Hudson and the development team began to worry about how well Oracle would manage the project, so they forked the project and released it under the name Jenkins.

So in this video we will set up a Jenkins continuous integration server on Amazon EC2. We'll start out by setting up an Amazon EC2 instance, or again, in other words, a virtual machine in the cloud. Once we have the EC2 instance set up, we'll install Jenkins on it and then set things up so that GitHub will notify our server every time a change is checked into our project repository. Jenkins will then check out the latest version of our project code, build the project, run all of its tests, and then we'll have it generate line coverage reports so that we know how well tested our code is. Of course, if any failures occur when building or testing the project, we will configure Jenkins to email us to let us know about them.

Now, for the purpose of this video I'll be using a Java project in conjunction with the Maven build automation system, but you could use Jenkins to build just about any type of project, as long as your build and testing process is automated.

Now, before we get started, there are two things you'll need in order to follow along with this video. First, you'll need
an Amazon Web Services account. From time to time I get email from people on the Internet asking me if I can create them a free Amazon Web Services account, and unfortunately that's just not possible. Running Amazon EC2 instances costs money, so it's not possible for me to provide accounts to people other than my own students. Amazon has very generously given me a grant to use for a course that I'm teaching, but the funds in this grant are specifically earmarked for my students only. So you will need your own account, and I cannot create one for you. I'm sorry about that. The good news is that Amazon offers a free tier, which essentially means that you can run an Amazon EC2 instance for an entire year and not pay a single cent. If you're interested in taking advantage of that, you can visit aws.amazon.com/free.

The other thing you'll need is the host name of an SMTP server so that we can configure Jenkins to send email. If your SMTP server requires a username and password, which most SMTP servers will these days, then you'll want to have that information handy as well. So, assuming that we have these two items, let's get started.
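
As a rough illustration of the CI cycle described in the captions (check out, build, test, notify on failure, then report), here is a minimal Python sketch. It is not how Jenkins is implemented and is not taken from the video. The names `run_ci_cycle` and `line_coverage` and the callback parameters are hypothetical, chosen only to mirror the steps and the line coverage definition given above.

```python
from typing import Callable


def line_coverage(executed_lines: int, total_lines: int) -> float:
    """Return the percentage of source lines executed by the test suite."""
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    if not 0 <= executed_lines <= total_lines:
        raise ValueError("executed_lines must be between 0 and total_lines")
    return 100.0 * executed_lines / total_lines


def run_ci_cycle(
    checkout: Callable[[], None],
    build: Callable[[], bool],
    run_tests: Callable[[], bool],
    notify: Callable[[str], None],
    report: Callable[[], None],
) -> str:
    """Run one CI cycle for a pushed change and return its outcome.

    All five callbacks are hypothetical stand-ins for real work
    (cloning a repository, invoking a build tool, sending email, ...).
    """
    checkout()                  # fetch the latest code from the repository
    if not build():             # compile the project
        notify("Build failed")  # tell the team, e.g. by email
        return "build_failed"
    if not run_tests():         # run the whole test suite
        notify("Tests failed")
        return "tests_failed"
    report()                    # e.g. line coverage or style reports
    return "success"


if __name__ == "__main__":
    messages = []
    outcome = run_ci_cycle(
        checkout=lambda: None,
        build=lambda: True,
        run_tests=lambda: False,  # simulate a failing test suite
        notify=messages.append,
        report=lambda: None,
    )
    print(outcome, messages)
    print(line_coverage(50, 100))
```

Passing the steps in as callbacks keeps the sketch independent of any particular build tool or mail setup. In a real deployment the CI server supplies these steps itself, triggered either by polling the repository or by a notification from it.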
Info
Channel: Jeff Shantz
Views: 302,467
Rating: 4.9094076 out of 5
Keywords: continuous integration, ci, jenkins, continuous integration server, amazon ec2, ec2, elastic compute cloud, maven, java, github, git, software quality, junit, unit tests, unit testing, build automation, jacoco, Amazon Elastic Compute Cloud (Software)
Id: 1JSOGJQAhtE
Length: 12min 37sec (757 seconds)
Published: Mon Feb 17 2014