Hello everyone. My name is Ashish Nanotkar. So today's session objectives are: we will be learning why DevOps, why to use it in the first place, what it is, what the market trends are, about different tools and
the job trends as well. Most of the people here would
be interested in knowing if I switch from my current job profile to another job profile,
would it be beneficial? Should I even invest in DevOps? Money is one thing, but there is also the time and the valuable brainpower you will be racking. So should we even be doing it? Another one is the skills required for you to be a good DevOps Engineer. I will help you to
understand what is required and when it is required. You might not meet all
the skills all the time but it is always good
to be an all-rounder rather than getting stuck
on some of the points. Most important is going to be the DevOps delivery pipeline. That's why we are here. So we will be talking a lot about DevOps delivery pipeline. What are the different components? What are the section,
the tools that form it? Concepts that we deliver
using that pipeline. So in different companies or
even with the same company but different projects
it would be different. So people would be using different tools, they would be building stuff differently. They might be using branching, they might not be using it. They might be using
Jenkins or some other tool. They would be testing the product in a different way. Someone might like to do it with JUnit, another might like to do it with Selenium or something like that. Stuff varies, so it's not a silver bullet that we are building here. I'm here to help you
understand the concepts and then we'll be talking about what the problem statement is. We will be taking up one problem statement and then that problem
statement will be continuing until the end. So I'll talk about a
problem that this is what we're going to solve in
the next seven classes so that you have a clear
picture of where you're heading plus the stuff that you
want to keep in between or the steps that you
want to build in between. So you can come up with your own plan. Every DevOps Engineer
who is responsible for building a solution is
in control of himself. So whatever he says that
I want these, these, these steps to be taken to reach
there, it's totally fine and it would be correct most of the time. Parts can be different. You can think a different way, I can think a different way, someone else would be coming
up with a different solution. It's totally fine and we appreciate that. So anything that you build by end of this complete course would be evaluated and you would be given
a practice certificate based on that. Okay, why do we want it? Why should we learn it in the first place? DevOps is a new word but
DevOps is not a new thing. People generally mistake DevOps for automation. They are mistaken when they say, I will be using Puppet or Chef, so I am doing DevOps. That alone is not DevOps. Like I told you earlier,
it's like a cultural shift that you want to work on. If your company is doing cultural shift pretty much awesome,
everyone is happy with that. So they can do DevOps only using
shell scripts if they want. I've seen people do very good automation using just their shell scripts for deployments, so it works. But here we will be
standardizing ourselves because we don't want
to re-invent the wheel. Whenever you want to automate something you would like to find patterns. If there is no pattern
you can not automate it, that's the problem. So even with artificial intelligence, when you say it's artificial intelligence, can robots think? No, they can not. They are fed with some patterns of thinking and they keep on mixing and matching them; that's the algorithm. Here in DevOps as well
we automate based on a couple of patterns and let's
see what those patterns are. So you must have learned
about waterfall model back in your college days. So what is a waterfall model, it includes a couple of steps that
everyone has to follow in order to get the software out. It's the staged pattern. So what happens is you build your software in various different
stages and those stages are then followed by yet another stage. So output of one stage is
acting as input for another one. But your whole software moves through it. Instead of moving the complete software through the stages, people
came up with the brilliant idea of implementing Agile. Fine, so Agile is good. What Agile allows us to do
is to keep your software in an always updatable
and deployable state. It also means that you
need to test your software multiple times. It also means that your
software has to be built in incremental fashion,
one chunk at a time; a complete piece of software can not be built overnight. A couple of lines of code,
test it, deploy it, check it. But that comes with its own problems. You need to do the steps
multiple times a day. So development is fine,
developers are paid for that, they develop all day long. But testers need to test the same features a thousand times a day, whenever something is changed, along with the incremental features and along with the regression suites. That is not humanly possible,
we need to automate that. At the same time when
you are moving the code at that pace you need
to make sure that people are not polluting your code base, they are following the standards. So in hurry, they should
not be overlooking something so there needs to be a
standard solution for that. All right, that's the
reason we have brought in automation tool to help with everything. And when I'm saying they
should not be overlooking it, that's the cultural part. They should not be overlooking it. It's not a process that
we're keeping there. What's the difference between
a process and a culture? Many of you must be developers
so when you develop something you write some code and then
you check in your code in Git. I'll talk about what Git is, what version control
systems are but for now let's say you want to move
the code from your machine outside to the world, to the public. So when you do that you
first need to test it on your machine, then
you need to commit it. Or if you're in a bigger organization and your code is critical,
you should get it reviewed by someone else. Hey dude, I've written this,
can you please check it for me? It's not like a teacher kind of check, he's not giving you marks but you just have another
pair of eyes look at your code to figure out something that
you have probably overlooked. So that has to be there. And it's not on a checklist
so nobody gives you a checklist that you
need to write the code, then you need to test it,
you need to build it, then you need to package
it, then you need to review, again test it again, then commit it. No, nobody gives you a checklist. That comes as a culture from you. That's why people bring in
employees who are already experienced because you bring
in some culture with you. So that culture can not be forced, but it can be put inline in the pipeline and made mandatory using the tools; that's where we bring these tools in. So we say, before you commit I want to test it. Or, if you have committed, before moving it to the master branch I need to test it. Who is going to do it? There is a solution for that: people have created a nice bot, and that bot makes sure that whenever a check-in happens we automatically test it. And to run the tests automatically there have to be some tests in the first place; you can not do everything manually.
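Just to make that test-before-commit idea concrete, here is a minimal sketch of a Git pre-commit hook; it assumes your project has some test command such as ./run_tests.sh, which is only a placeholder name:

    #!/bin/sh
    # .git/hooks/pre-commit  (make it executable: chmod +x .git/hooks/pre-commit)
    # Runs the test suite before every commit and blocks the commit if it fails.

    ./run_tests.sh          # placeholder: replace with your real test command
    status=$?

    if [ "$status" -ne 0 ]; then
        echo "Tests failed. Commit rejected."
        exit 1              # a non-zero exit makes Git abort the commit
    fi

    exit 0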
So that's another cultural change that we want to make: manual testing has to go away slowly. We want to automate everything, so there has to be a test case, there has to be a code
coverage statistics. When you bring something
from human to machines, human can say I have
tested the code completely. When he says completely,
does he provide you with statistics, what percentage? No. When it goes to a machine, the machine can not just claim how much code has been tested; it has to be calculated. So there are some statistics
according to that and in DevOps we do talk numbers a lot. Since you are moving from
manual stuff to automated stuff, we talk numbers. We say earlier your code had, let's say, 50 percent code coverage, as in the test cases that you have written only test 50 percent of your code. Then we write some more tests, some more scenarios are covered, and when you execute your program more of the lines of code are actually executed, so your code coverage increases from 50 percent to, let's say, 95 or 96 percent.
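As a rough sketch of what enforcing that number looks like in practice, assuming a Python project using pytest and the coverage tool, and a 95 percent threshold that is just an example, a CI step could be as simple as:

    #!/bin/sh
    # Run the test suite under coverage measurement.
    coverage run -m pytest

    # Print the coverage report and fail the build if total coverage
    # drops below 95 percent (pick whatever threshold fits your project).
    coverage report --fail-under=95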
That's the statistics we talk of, and every single step that we build in DevOps is a statistical step. So that's good and that's
good to remember as well that we should be having
statistics at each and every point in the DevOps pipeline
or development pipeline or build and release pipeline,
whatever you call it. So here what we're going to do is we're moving away from waterfall, we are embracing Agile but Agile has its own problems, that's why we are trying to remove the redundancy
and support the culture by automating it using the tools. Since this culture is very much
common all across the globe, so whenever you see an IT company, pretty much everyone tests their code before putting it out to production. So if they do so we take up a concept. The pattern is to test the
code, that's the pattern. You test it using any tool you want or simply write a script
for it if you don't want to use an automated tool,
if it is too big for you. But that concept has to be there. So build, test, deploy, and again redirect so you'd be redirecting again. At every level there has
to be a feedback loop. What is a feedback loop? So see here in the stages, requirement gathering, system
design, implementation, testing, deployment, and maintenance. If you see here, the output of one is going into the input of second one. If second step, let's say
on the implementation phase someone figures out that
the design wasn't proper, that's why we are stuck
at this particular level, they will shout back
and this shouting back is not quick enough. Generally, when waterfall
model was being used different companies or
rather different teams were blocking the picture. So testing team was completely different, it had nothing to do with development. They never knew who the developer was and the complete software
was given to them by a different company to test
and after a couple of weeks, couple of months, they
came up with a big report that hey, these things are not
working, you need to fix it. All right, that's fine
but during that span, a couple of months later, the developers who were
actually working on that code, they forgot what they were working on. And that literally happens. Do you remember what
you ate two nights back? No, you don't. So how do we fix this? We don't allow any developer
to change the context. And changing the context as
in how long does a person remain in the same context? A couple of minutes,
let's say writing a code, 400 lines long code, and I've tested it, it works for me because I am
testing just the one feature and I am pushing it away. So when I'm pushing it off,
there needs to be a software that tells me that something
is wrong with the code. Instead of testing just
one feature that I did, it will be testing all
the features, all of them. See if I'm breaking up someone else's code or someone else's feature,
then that testing engine would tell me. And if we test it that fast, in a couple of minutes it would get back to me. It would say, hey man, you have
broken this particular code. It works for your feature but someone else is actually crying so I
don't allow you to move it to the master branch. So fair enough, I'm still
working on my laptop. I'm happy to receive an
email in a couple of minutes and I know what I was
doing so I'll fix it. That's soon enough, right? That's called a feedback
loop and that feedback loop, right now I've just told
you about the testing part but it can also be at
the deployment stage, it can also be after the deployment stage, at the monitoring stages. It would not just be a notification to me, it could also be a notification to the system. So we would have a system which will automatically scale up and scale down depending on the traffic. Let's say your traffic is at 80 percent or more. What do I mean by that? I mean that your CPU levels are actually surging to 80 percent on all of the machines, so that would trigger a notification. That triggered notification would automatically add a couple more machines to your network.
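Just as a hedged sketch of what the bot behind that notification might run, assuming an AWS Auto Scaling group named web-asg, where the name and the 80 percent threshold are placeholders:

    #!/bin/sh
    # Crude scale-up run book, triggered by the monitoring system.
    # "web-asg" and the AWS setup are assumptions for illustration only.

    CPU="$1"    # average CPU percentage (integer) handed over by the alert

    if [ "$CPU" -ge 80 ]; then
        CURRENT=$(aws autoscaling describe-auto-scaling-groups \
            --auto-scaling-group-names web-asg \
            --query 'AutoScalingGroups[0].DesiredCapacity' --output text)
        aws autoscaling set-desired-capacity \
            --auto-scaling-group-name web-asg \
            --desired-capacity $((CURRENT + 2))
        echo "Scaled web-asg from $CURRENT to $((CURRENT + 2)) instances"
    fi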
It's like having a run book automated for ourselves. A run book is just the couple of steps that one needs to perform when a particular scenario is observed, usually by the ops guys, but here you would have the run book automated to carry out these steps. So there are a lot of
stuff but if you see here we're just supporting a
different kind of culture. That's why DevOps is also quoted as a complementary part of Agile. So if you're Agile and
you don't have DevOps, you can not implement Agile
in a fully efficient manner. If you have just DevOps
and you're not Agile then DevOps would not help you. If you automate everything
but still work along with the waterfall model then
what's the use case of it? Why do you use so much
automation in the first case when you are not automating anything? So it needs to go hand in hand. So perfect use case would
be a project to which changes are coming in very frequently and you need to deploy it to production very frequently. Very frequently as in a couple of times a day. If it is at least once a day, it's fine. But if you deploy with
a gap of three days, a week, two weeks, no, it's not DevOps. So the whole idea of doing this automation is to remove the pain points from the ops side, the people who are actually resisting the changes. What these people say is, and it's a fair idea actually: something is working just fine, why would you want to change it? Let's say I have a tap in my home which has been working fine for ages. It's working fine, no problem. Other taps have been changed much more frequently, but this one is a sturdy one, it doesn't break, it's cool. So if it is working, why do I change it? Fair question, right. For the ops guys as well, the code that was working initially, if it is working fine without any errors, why do I change it? And it's not just about the code. When I change the code I may need to change the database schema. When I change the schema, I also need to make sure
that the logging is proper, I need to make sure that the
queries are not impacted, the response times are good. I need to make sure that the new code that I've dropped in
is properly set up and if it is not for the same machine I need to bring in other machines. So this whole drill takes
a lot of preparation for the ops guys. That's the reason why they registered. They have got a lot of
things for different sections so for Ubuntu they won't
be having different, for Centuis they would be having some different support team. For databases different support team. And since it is a
completely different team it's a silo so they need to go through all the approval levels. Shoot an email that hey buddy,
I need to deploy my code next month on second of
Jan. so be ready for it. They would say no, no, we
are having some different work to be carried out during that window. So it's a lot of work and
it's a big event for them. We don't want this big
event to be reoccurring. What do we do? We automate it. So once you automate
it and you are doing it hundreds of times a day,
it's not a biggie for you. But how do you do the same thing hundreds of times a day
reliably without failing? And having everything automated
right from the deployment to rollback in the worst case scenarios? That's what the thing is. There's a question here from Mr. Basu. The question is, can you give more example of run book in a development scenario? Yeah, why not? See in development scenarios,
when you build something and it fails, you would like to rework it. You could think of something, but for actual development work I wouldn't recommend you use a run book there. If it is a container, though, and apart from the code development itself you want to test it locally on your machine, then if it goes right you commit it automatically, and if it doesn't go right you keep the container and inspect it. That can be an example
of a run book for you. It can be implemented automatically using a script. I hope that helps.
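As a hedged sketch of that kind of local run book, where the image name and the test command are made up purely for illustration:

    #!/bin/sh
    # Local run book: build the image, test it in a container,
    # commit the change if tests pass, otherwise keep the container to inspect.

    docker build -t myapp:dev . || exit 1

    if docker run --name myapp-test myapp:dev ./run_tests.sh; then
        docker rm myapp-test                    # clean up the throwaway container
        git commit -am "Change verified in container"
    else
        echo "Tests failed; container 'myapp-test' kept for inspection."
        echo "Inspect it with: docker logs myapp-test"
    fi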
But for the actual coding itself, I wouldn't recommend you do anything automatically to your code. So there has been a lot of adoption of DevOps after 2009, the latest development
being in 2014 onwards. Sorry for another question here. The question is from Swaswagun. Is JUnit a run book? No, I wouldn't say it's a run book, it's just a test framework. A run book is a scenario
that I could build. It's just a scenario,
it's not for testing. It's a scenario that you revisit
if something has happened, the problem if the
problem statement arises, you go to the run book, figure
out where is the problem, and what are the steps to fix it. That's what the definition
of a run book is, okay, so JUnit is not a run book. All right, so from 2014 onwards there has been a great exponential curve, like a J-shape here, in DevOps adoption. There's a lot of traction that
has been seen in the market. That is the reason why a
lot of people are moving from non-DevOps world to DevOps world, moving their profiles. Okay, it seems to be cool but it has been in picture since long. Has anybody tried to
search for DevOps jobs at Google or Facebook or Netflix? You wouldn't find one because
there is no DevOps there. That is one level of DevOps. So there would not be any DevOps when you reach the final stage, that's what I told you. That is the level. So once you automate everything,
your culture is proper, you don't need a DevOps team any more. So automated ops is not DevOps. If you use a tool to
automate your ops thingy, it's not DevOps any more. It has to be on both sides. So these days people are realizing it. Actually, Agile is evolving too. There's a new concept called Modern Agile, which is based on just four principles rather than the 12-principle manifesto which was raised earlier. It happened in late 2014, early 2015, almost two years ago now, one and a half years to be precise. But it's fine, if it
works for other people and they are adopting it
slow, it's totally okay if it works for them. So now we need to support that transition, hence we are learning
what we are learning here. Okay, there's a question here from Tenish. The question is we have a
release and deployment cycle which is monthly and quarterly. Can DevOps help here? Usually when module is deployed at times we miss some components. Also at times a few
months after deployment some of the components are missing in some of the client machines. Can DevOps help here? Sure, it can. You should be adopting
that any time sooner so DevOps can help you do this. Instead of deploying and
releasing it monthly, you could make it more frequent. You could create a
small client or a script which would make sure that
every related component and dependencies are pulled before to configure something on
your clients' boxes. It can be automated and it should be automated in the first place. I don't know what your product is, what the modules are, what
the technicalities of it is. You are the best person to
figure out the solution for that, but yes, you could do that. Question is from Hema that I'm working as a virtualization admin. How can DevOps help me
to enhance my career? Since you're already in virtualization, you are in a position where you will be nimble enough to provide multiple machines or environments to people. Now, when we look at the CD part of the pipeline, we need a lot of infrastructure to be built. For that infrastructure, since you are a virtualization admin, you know there will be different components to virtualization apart from just the virtual machines. All of these would provide you with an API as well. That API can be programmed, so once you program everything using that API you could put it behind a nice bot, something like Jenkins, that will help you out. So instead of you doing all the virtualization work, Jenkins will be doing it on your behalf, so automate that. It can be automated.
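Just as a hedged illustration of what such a bot-driven step might look like, where the AMI ID, key pair, and instance type are placeholders and a configured AWS CLI is assumed:

    #!/bin/sh
    # A provisioning script that Jenkins (or any other bot) could call,
    # instead of someone creating the virtual machine by hand.
    # The AMI ID, key pair, and instance type below are only placeholders.

    aws ec2 run-instances \
        --image-id ami-12345678 \
        --instance-type t2.micro \
        --count 1 \
        --key-name my-deploy-key \
        --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=build-agent}]'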
You must be doing it right now as well, just not in the DevOps fashion, but once you finish this course you'll see how Jenkins will be doing that stuff for you, and you can move on from there. I hope that helps you. But as a career, it also pays to move outside of just virtualization, because it's just one of the small chunks of the DevOps thing. A complete DevOps Engineer, or
end-to-end DevOps Engineer would be the one who knows
how to write the code, who knows how to test it,
who knows how to build it, how to package it, how to test it again, then deploy it somewhere,
again do some testing there, along with the statistics. Then move it on to production,
create the infrastructure, configure it properly,
and make everything that you have done, apart
from coding, redundant. You should be able to get past any catastrophic effects; let's say your Jenkins blew up. Instead of shouting about who did it, you should be able to bring it back up in minutes. If one of the machines on your production systems died, you should be able to get it back, and not just from a backup. So it's about doing something which will help you prepare yourself for failures. So if something fails
you should be able to get it back without shouting. That's what the end goal of it is. So you need to learn a lot of things to be an end-to-end DevOps Engineer. Plus you need to have an experience about different architectures, different deployment strategies, different solutions for
different business problems because not all the businesses are same, everyone likes to run their
business their own way. So the solution has to be
crafted according to that. So it's always a good idea to know a little bit of scripting, some coding, that's always helpful because tools wouldn't always help you. Not all the tools would
fit in your organization. Let's say my organization is a smaller one, that's what I'm concerned with, and we are not running more than 15 machines in production. Would I recommend them to use Kubernetes for controlling all their containers? I would not, because it is an overhead to manage Kubernetes itself on another four or so machines, adding almost 40 percent to the budgeted value of the infrastructure. It doesn't work for me. I wouldn't recommend using Puppet or Chef either; they're too bulky, too heavy, to manage in a high availability setup, so I would not recommend that. I would switch the tools; the concept stays the same, I
would use something else so you get an idea. So that's the end-to-end
DevOps Engineer for you. If you are low on a couple
of skills I would say just help yourself to that, ramp up. I'm here for another one
month so it's a good time and if you want to practice it literally at least spare two hours
a day for next 30 days. So that's a good amount of
time for you to practice a lot of stuff which
has been covered here. Apart from the slides, you could also jump onto any other tool you want and bring in the questions
after the class, it's fine. Question from Mr. Wayne Cutter. The question is, how feedback loop is used in an organization. Okay, there will be different
feedback loops, a lot. The more you create, the
more you will be able to tell what's happening in your CICD pipeline. So first feedback loop happens
right at the commit level. If you're committing something and your code is not sanitized, say you are missing a semicolon, that commit itself will check that the code is faulty, it wouldn't run, so it wouldn't allow you to commit. That is the first feedback loop and just the smallest one. You get to know about that particular problem within a second. That's the first one. Next, if the code moves past the commit section it would
check for the coding standards. So if your organization's standard needs a space between the closing parenthesis and the opening parenthesis when you don't pass a parameter, and you have missed it, it will tell you. This time it would email you, or it would IM you, or send you a direct message; that's another feedback loop that you could create.
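A minimal sketch of that second feedback loop, assuming a Python code base, the flake8 style checker, and a working local mail command, all of which are just example choices:

    #!/bin/sh
    # CI step: enforce coding standards and notify the author on violations.

    AUTHOR=$(git log -1 --pretty=format:'%ae')   # email of the last committer

    if ! flake8 . > style_report.txt 2>&1; then
        mail -s "Coding standard violations in your last commit" "$AUTHOR" < style_report.txt
        exit 1    # fail the pipeline stage so the problem stays visible
    fi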
Feedback loops can be of different types: at the commit level, at the testing level, then at the deployment phases. After deployment, if the performance is not right, if it is not within the brackets
that you have set up, it would tell you that hey
buddy, after this change the performance actually
decreased by 20 percent and you just have a tolerance level of three or four percent. So it would tell you. Nobody thinks of it, right. When you deploy the code
online it moves to production and nobody actually figures out that after a particular feature has been deployed how much extra load is being
generated on the machine. In production, how much has the response time decreased or increased because of that? What is the impact of it? No one measures it, apart from the really, really big companies, the ones where they don't have any DevOps jobs. You should be doing it.
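As a hedged sketch of such a post-deployment feedback check, where the URL, the baseline file, and the four percent tolerance are all placeholders:

    #!/bin/sh
    # Post-deployment feedback loop: compare the current response time with
    # a recorded baseline and complain if it regressed beyond the tolerance.

    URL="https://example.com/health"       # placeholder endpoint
    BASELINE=$(cat baseline_seconds.txt)   # e.g. 0.210, recorded before release
    NOW=$(curl -o /dev/null -s -w '%{time_total}' "$URL")

    # Allow a 4 percent tolerance over the baseline.
    LIMIT=$(echo "$BASELINE * 1.04" | bc -l)

    if [ "$(echo "$NOW > $LIMIT" | bc -l)" -eq 1 ]; then
        echo "Response time regressed: baseline ${BASELINE}s, now ${NOW}s"
        exit 2
    fi
    echo "Response time OK: ${NOW}s"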
So at different layers there's a feedback loop. On the infrastructure side also, if one of the machines dies, or your logs aren't getting logged, or there is a network condition, there are sensors. I wouldn't call it a
little physical sensor. It's just a software program which tests the state of your network,
the state of your machine, the state of your disk, the RAM, the CPU, on different boxes, and it will send you some logs back, some notifications back. Based on those notifications there's a bot that takes care of notifying you. So that bot does some calculations, some math, and it sends you back a notification. Any computer system has got zero IQ, you know that, right? So you need to write a
program for yourself. If you don't want to write it, search for someone else's program. So the tools that we
are learning right now, they have been evolved as a solution to some of the problems. And since we find ourselves
at the same locations, same position that I'm also
facing similar kind of problem so let's see if the
solution fits in for me. And most of the cases it does. That's the reason why
we are using these tools like Jenkins or Puppet or
Chef because we find ourselves on a similar side of
the problem statement. I hope that answers the question. Question here from Nagendra that once automation sequence is stated by QA and handed over to DevOps they will continue to run the automation. Then what is the role of QA
in this DevOps lifecycle? Yeah, good, this is the most
mistaken scenario of all. People generally think that
testing has little to do with it but I will say testing
is most important part of any CICD pipeline. The first reason why you create CICD is to automate testing. It's never been handed off
to any of the ops guys. DevOps is not setup by ops guys so it's never been that so far. So once your test suite is
created that has to be automated. So creation of test suite has
to be done by QA, correct? But when you create it
it has also to be run. It can not be run just
once before deployment, it has to be executed
every time someone commits in DevOps or in Agile
world because you need to test the complete code. Testing or running the test suite is not a part of QA's life. The reason for that is
because we don't want QA to be just running the test suites. It's just wasting a lot of
time and sitting there idle, doing nothing. So it is handed off to your machines. So DevOps pipeline makes
sure it is executed. And testing is supposed to
be done at different levels. So you've got unit testing,
you've got functional testing, you deploy the code then you
have your integration testing to be done. After the code is deployed, it
moves on to the staging area where it is combined with other modules, third party services. You again do a regression test there, make sure that it is
performance validated properly and then it moves on. So QA would be a really busy
person writing down tests for each and every new
feature that is being released in your Agile cycle. So their role comes at
various different levels. Any time you choose to hop
a level with your code, let's say you are moving from DEV to QA and from QA to Stage, that
level has to be sanitized, it has to be tested. And who's going to test it? Of course the QA, right. So I hope that answers the question. QA has got a bigger role to play in this complete DevOps lifecycle. Question from Frenima Salu. The question is how to help
DevOps at UNIX admin level. At UNIX admin level, see right now you are working in a silo, that's why you are
asking me this question. The best part would be your
company allows ops guys to talk to DEV guys and
when they talk to DEV guys they learn how the development
is happening there, what are the problems that they're facing, automating the stuff and
get the code continuously rather than getting it at the end. If you want to automate
things at admin level, because you're not talking
about deployment and stuff, you could really automate
everything you have right from configuring
your development machines, creating their user accounts,
setting up dial up boxes or setting up the
firewalls and getting stuff routed in and out. But you can do it in a reproducible manner so instead of coding
on the machine directly or instead of configuring
the machine directly, you write some scripts for yourself or use some configuration management tool like Ansible or Puppet or
Chef, which can help you address the problem in
a nicer, recursive way. So let's say your firewall machine blew up and how much time do
you think you would need to reconfigure everything? A couple of hours, right. Wouldn't that be easier if you have already configured something,
configured a software to create that firewall system for you? It would do it in minutes. If you blew away your LDAP system, you should have a backup which
would create the users again, configure the LDAP machine just like it was, plus reconfigure the other machines which were actually pointing to it. There are a lot of things that you could do. So configuration management is something that will really help your life. Does that answer your question? Question from Eric. The question is, what all tools do we need to create a complete cycle, like building, testing, deploying, and so on? Yes, that's coming in the next slide so I will not jump ahead of myself. I'll cover it. This is another question. The question is from Nagauzhina that I'm just a beginner, completed
my Master's recently. Can I get through this by a month? It depends on you. For me, if you ask me if I
were in your position, yes. That's fairly easy. That totally depends on
how much time you put in. I would recommend everyone to at least put two hours of time daily
for the stuff that we are doing plus anything else that you want to learn and come with questions. So I really recommend you to do so. Okay, now that we have
talked about the waterfall model and the problems that are there, we will also be looking at what we are trying to solve here. On the cultural part, the green things that you see will be telling you the pointers, and the red ones are the problems that we need to solve. So, huge waiting time for code deployments. There's a lot of pressure
to work on the old code, the pending ones, and the new features at the same time helping developers switch between the things and
at the same time test it to solve it. Maintaining the uptime of
the production environment is a bigger problem. If you fix it for the ops guys,
they would be really happy to deploy your code as
many times as you want without worrying about the CAB and all these kinds of just routines. Difficult to diagnose and
provide feedback on the product. That's difficult, you
need to be in a position to monitor everything. There's a bigger problem than monitoring. If you monitor everything you get logs, and a lot of logging gets done. You also need to be in a position to analyze those things and tell whether the system is working in a proper state or not. And how do you analyze it? It's not just a couple-of-seconds-back kind of scenario, it's about a couple of weeks back. So let's say a couple of weeks back I see that the traffic level on my machines rises on Monday, steadily goes down on Tuesday, Wednesday, Thursday, Friday, and drops to zero on Saturdays and Sundays. That's what my traffic stream looks like, but how would I know about it? I need some analytics engine which will tell me, with a nice visualization graph, that dude, this is how your graph looks. And if this week I see the graph drop to just 50 percent of
traffic right on Tuesday, I would smell something fishy. But how would I know that? Because on the monitoring front end it tells me that all the machines are up and running, everything is fine, the application is running cool, but I still don't know what the problem is that made the traffic go down to 50 percent. In order to figure out why, you first need to know that it happened. So that's a bigger problem here: you need to have an analytics engine. That's why people have come up with various different solutions. So we have got a solution like the ELK Stack: Elasticsearch, Logstash, Kibana. That trio is really awesome in the open source world.
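Even before you get a full analytics stack in place, a crude script already gives you that per-day traffic number; as a hedged sketch, assuming a combined-format web server access log at a placeholder path:

    #!/bin/sh
    # Count requests per day from an access log, so that a sudden
    # 50 percent drop in traffic actually becomes visible.
    # Field $4 in a combined-format log looks like: [12/Mar/2016:10:05:01

    awk '{print $4}' /var/log/nginx/access.log | cut -d: -f1 | tr -d '[' | sort | uniq -c

Tools like Kibana or Splunk essentially do this at scale, continuously, with graphs on top.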
We have got Splunk, which is a non-open-source solution but it is really awesome; you should have a look at it. There are some paid and hosted solutions like New Relic out there which will help you do the job just fine. It totally depends on which one you use, but you need to have an analytics engine to give you near real-time,
if not real-time, statistics on your current application. Question from Hema. After this course, will
I be able to get a job as DevOps Engineer? Well, no course can help
you to be what you are, it can only guide you through the things. What you need to do is to
be done all by yourself. So this is not a certification that you will be getting a job somewhere. For getting a job you need to work and whatever is being covered here matches to the job profile of most of the job openings out there. So you know a CICD pipeline,
you know a solution, you know the concepts, you know the tools, then you can apply for the job. But getting experience on it
is totally a matter of time and the more you work because if you work let's say twice as
hard, you should be able to chase your dreams twice as fast. And I've seen people
who have been in DevOps for almost four or five years
but they don't know anything. So leaving that part, answer
for your question is yes, if you work. Question from Michelle
that, please suggest most basic tools to
start with for a person with no background in ops or testing. Don't worry, this course is all about doing that from scratch. We don't have any prerequisites. I don't require you to come from a specific admin or testing or development background, no. If you are a fresh
starter, it's still good. We talk about the tools. There's this question from Kumar that I think that that's where
IDM products come in handy. Yeah, they do. Question from Vankert
that as a DevOps Engineer, one has to focus both on
development and operations? In an ideal world, yes. But your company may just be hiring for automated ops, sorry to say, a lot of people do that, and that is not DevOps again. They would want you to be specialized in just one thing, either Jenkins or Puppet, or they would want you to be working on the cloud part, specialized in just one of the cloud providers like AWS. You don't even need to code anything, you don't even need to
learn about Jenkins, you don't even need to know
what the application looks like. That's not DevOps, that's bad. But they will pay you handsomely for that so as a straight answer
to your question, no. But if you really want
to be a DevOps Engineer that's a must, you should. Cool, so there are a couple of problems that we have figured out
and how do we fix them? These problems are very generic and they will be present in most of the industries working in IT. So DevOps would help you in automating your infrastructure and it would help you in automating your workflows. Projects will be having
different kind of workflows. And talking about numbers,
so continuously measuring the application performance. Not just performance,
but also code coverage, the level of testing that you are doing, statistics about how much
stress the application can take, how many users, how many concurrent users, how many connections
your database can handle, the amount of RAM that is required, the footprint that is
there, a lot of stuff. You would get a lot of statistics. We have barely 500 machines or so and we generate almost 55 to 60 GB of logs a day. For analysis, that's a lot, I think. When you talk about big data it's not even considered big data, by the way. That's not big data for us, but it's a lot of logs that you need to analyze. And analyzing 50 GB of text files, not binaries but text files: you know a couple of MBs of text file can be difficult to open in a text editor, so how do you analyze that many logs? You need something to
do that magic for you. So the challenge is like waiting
time for code deployment, automate that. Pressure of working on old
and pending and new code at the same time, use
branching, use a smarter version control system. Why are you still stuck with
something like Subversion? Move ahead. So what's the DevOps solution for that? Continuous integration,
that's the first section of the CICD pipeline. We call it CICD: in CI, C stands for continuous and I stands for integration; in CD, C again stands for continuous and D stands for deployment. There is also something called delivery. There is a slight difference, but people tend to merge delivery with deployment. It's fine, it's not a hard definition anyway. I'll come to that when we talk about CICD. So here, once you set up the CICD pipeline, continuously integrating the
code with the existing one to make sure that the
existing one doesn't break, that's the whole idea of CI, and that is done automatically
using some of the tools. So that is the solution for that. For working on different stuff at the same time, use branching strategies and use a better version control system. We'll talk about one which is a better version control system, called Git. We might set it up or we might use the whole solution, depends on how much time we've got. So tomorrow will be the day when you'll be learning about Jenkins and Git.
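Just to make that branching idea concrete, a typical flow looks something like this, where the branch name is only an example:

    #!/bin/sh
    # Work on old, pending, and new code at the same time by
    # isolating each change on its own branch.

    git checkout -b feature/login-captcha    # new feature lives on its own branch
    # ... edit, test locally ...
    git add -A
    git commit -m "Add captcha to login"
    git push -u origin feature/login-captcha # CI tests this branch in isolation;
                                             # it reaches master only after it passes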
Managing the uptime: if you want uptime, jump to virtualization. Why would you want to spin it up on a physical machine which is very tough to procure and then set up? If your machine dies for any reason you'll have to literally bring
it out of the data center and then push another one. So we make sure that
we use virtualization. So scrubbing of a virtual
machine is very easy. So you could create one, scrub it off, and still have a backup of it on your storage device. You could run it from the backup again. So it's fast, it's reproducible. That is the reason why cloud
providers have moved from your dedicated hardware
to a virtual hardware. So they would provide
you with a virtual machine rather than giving you an actual machine. They call it an instance; that's virtualization for you. So right now we are dealing with EC2 virtualization, as in virtual machine virtualization on AWS. In the newer generation, three or four years from now, we would still be looking at virtual machines, but only in a limited way. A virtual machine will be created only for setups like, let's say, something that you don't want to change a lot, a buffer server maybe, or some machines that need a lot of resources and can not be moved from one box to another, something like, let's say, a web server or something like that. The world will be moving from
your virtual machines to containers and the
traction has already begun. So people are moving from
their existing VM architecture to the container architecture and deploying the code there. It's really fast and
trust me when I say fast, it's blazing fast. If you take 10 minutes
to spin up your machine and another 10 minutes to
get it and deploy stuff on it with containers you would do
that in a matter of seconds. We talk about containers
a lot so that you are ready for the future and of course for the current times as well
when people are transitioning from non-container to
container environments. Tools to automate
infrastructure management are not effective. Okay, we'll make it effective. We'll see that every
tool integrates properly with the other one and if it doesn't you should know a little bit
of coding to make it happen. In the software world
everything is possible; there just needs to be the talent to make it possible. Configuration management is something that helps you integrate different sections of your infrastructure together, reconfiguring them again and again, even if something fails, reliably and every time. So we will be learning about configuration management. The tool that you're going to learn is Puppet.
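To give a feel for what a tool like Puppet does for you, here is a hedged hand-rolled equivalent in shell: an idempotent script you could run again and again on a box, where the package and file names are just examples; Puppet lets you declare the same end state instead of scripting it:

    #!/bin/sh
    # Idempotent configuration: safe to run repeatedly, only changes what drifted.

    # Ensure nginx is installed (example package).
    if ! dpkg -s nginx >/dev/null 2>&1; then
        apt-get install -y nginx
    fi

    # Ensure our site config is in place, and reload only when it changed.
    if ! cmp -s files/mysite.conf /etc/nginx/conf.d/mysite.conf; then
        cp files/mysite.conf /etc/nginx/conf.d/mysite.conf
        service nginx reload
    fi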
In the CI section we are going to learn a tool called Jenkins. For branching strategies we are going to learn Git. The number of servers to
be monitored increases so we need to have a solution
which is easily scalable. Client server kind of architectures are generally very scalable
because you don't need a server to be interacting with clients, client will be sending data back. Okay, so we just need
to worry about server. So we will be working
with Nagios to understand how monitoring works on larger scale. And it's difficult to diagnose
and provide a feedback on the product. So for getting the ready feedbacks again we need a monitoring. So with monitoring we also need analysis. Nagios will not provide
you with real-time analysis but still it gives you a fair
idea of what's happening. People generally don't use Nagios for application monitoring; they would use whatever analytics engine they have, maybe the ELK Stack, for the purpose of firing the queries and generating notifications based on that. But you will see how Nagios can be used for that part too. So I'll show you how to create your own plug-ins for Nagios so that you can adapt Nagios to your own requirements. And it's easy, it's very easy.
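A Nagios plug-in is usually just a small script that prints one status line and follows the exit code convention, 0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN. A minimal disk-usage sketch, with made-up thresholds:

    #!/bin/sh
    # check_root_disk: a minimal Nagios plug-in for root filesystem usage.

    USED=$(df / | awk 'NR==2 {print $5}' | tr -d '%')   # used % of /

    if [ "$USED" -ge 95 ]; then
        echo "DISK CRITICAL - ${USED}% used on /"
        exit 2
    elif [ "$USED" -ge 85 ]; then
        echo "DISK WARNING - ${USED}% used on /"
        exit 1
    fi
    echo "DISK OK - ${USED}% used on /"
    exit 0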
All of this stuff, installing and setting up any of the tools: although I will be covering it, you can find all the material online. Do not rely on course material which is provided just by any class. So don't rely on that because
the world moves very fast. Everyday you'd be having
a new version released. Everyday you'd be having a
couple of more articles to read. You need to keep yourself
updated and DevOps is one such field where if you're outdated for three months you lose
a bigger release out. You will lose your updated state so you need to be continuously updated. Question here from Sidar
that will you cover branch management and
reversal of code check-in on regression some time
during one of the sessions? Are we also going to
consider pushing the code on multiple branches and its
challenges during the CI phase? I will make it a point. I'll remember this, okay. But I can not guarantee
that I will be able to finish it in time. Maybe after the class
we can discuss about it. After tomorrow's session
when we cover Git, if not in class we'll
definitely cover it after, at least talk about what the problems are, I'll cover that. Question from Shiv, why
Puppet has been selected instead of Chef and
Ansible because the latter is easier to understand. I would say in terms of learning curve Puppet is the toughest of all, and once you know the concepts you can easily implement them on Ansible or SaltStack or Chef. On the documentation side, if you ask me my own preference, I would go with the easiest first. So the documentation for Ansible is the easiest, then comes SaltStack, then comes Chef, and last I would go to Puppet. So if you want to learn Chef it is really easy, you can learn it in a day. Ansible, even easier, a couple of hours. SaltStack I would compare with Chef, and Puppet is the toughest, because even if the documentation is fairly simple it is a lot lengthier to understand the very first time. So we'll be taking up Puppet
and if you are interested to match it up with Chef and Ansible I will also do that
during the Puppet classes. So we were talking about DevOps and answering a couple of queries. The question here from VJ
that he's from IVR background and can he fit in DevOps. Yes, you can but for learning anything new and to try anything new,
you will have to leave your comfort zone. You can not take your
comfort zone with you and try to fit it into
one of the sections. I hope that answers the
question pretty much straight. Question from Vina. The question is that we
have recently migrated from mainframes to a distributed platform, with the stuff running on Rails; it is quite closed, and the branching strategies, requirements, and testing automation are kind of messed up. We do everything manually. Can I ask if there is a chance of automating anything? How do you think it might go? Yes, you can automate
that is built there. It can be and it should be
automated, but you'll have to find a way to do it. I would not be in the best position to answer that question because I don't have insight into it. And if you want to really discuss it, you can do so after the class. So I'll park the question for discussion if you want to have it, but after the class, definitely. Question here from Hema: I'm trying to install Ansible on my
Ubuntu 12.04 machine. I've copied the RSA key to the server, but still, whenever I try to run an Ansible command it says no host found. The host is not found because which machine are you trying to connect to? It tries to connect to the servers listed by default, and in the hosts file you need to add one more entry which points to your localhost. Then it should be able to connect to it.
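As a quick hedged sketch of what that fix looks like, using the default inventory path, which you would adjust to wherever yours lives:

    #!/bin/sh
    # Add the local machine to the Ansible inventory and verify connectivity.

    echo "localhost ansible_connection=local" >> /etc/ansible/hosts

    # Ping every host in the inventory; each one should answer with "pong".
    ansible all -m ping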
Question from Juggernaut: will we cover Docker? Yes, there are two modules specifically designed for containers and Docker. So we will be covering it pretty much in detail, including how to deploy stuff, how to work with registries, what keys are, what images are, how to recreate images in an automated way, all of that stuff.
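As a hedged preview of that registry workflow, where the registry host and tag are placeholders:

    #!/bin/sh
    # Build an image, tag it for a private registry, push it, pull it elsewhere.

    docker build -t registry.example.com/myapp:1.0 .
    docker push registry.example.com/myapp:1.0

    # On any other machine that can reach the registry:
    docker pull registry.example.com/myapp:1.0
    docker run -d --name myapp registry.example.com/myapp:1.0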
So we will be covering that, don't worry about it. All right, let's move ahead. So we were talking about the job and salary trend for DevOps. I told you that it is something like a J-shaped curve, and here you can see from 2014 to 2015 it's an exponential one and it keeps on rising. It will stay that way for the next two years, I think, given the current traction in the DevOps stream. Later on it will move to very specific roles, because everyone would
have adopted DevOps by then, or at least know it from the ground level. So people will be hiring for very specific roles instead of talking about just complete DevOps, because they would not be talking DevOps then; they will be talking about SREs, Site Reliability Engineers, rather than DevOps Engineers. So be ready for that cultural shift later on. What skills do you need to be a DevOps Engineer right now? Like I told you, you need to be an end-to-end DevOps Engineer, but for the tooling and the CICD that
we are building right now or any of the job descriptions
that you will be getting they would be depending on these things. Tools, you need to know Git,
you need to know Jenkins, Docker, Puppet, and Nagios, which fall in different categories. So version control,
containers, integration, virtualization, configuration management, and monitoring. Five different categories. Then comes in your
general networking skills. Your DNS, HAProxy, DHCP, setting up of web application servers,
you need to do that. The network application
servers keep on changing, depending on what
product you are building, which project you are working for, technology stack keeps on changing. It will not be always Java. It could be PHP, it could
be Python, it could be .Net. Any language that your company works in you'll have to automate that. So that part would differ. Other skills which are
related to the cultural part, if you are acting as a
DevOps Transition Specialist like I do you also need to be aware of how to talk to people, how
to set up the processes, how to change the processes,
what are the drawbacks of changing it too fast or drawbacks of not changing it at all. So these things, talking to customers, building relationships with them, getting the MPAT account
in that organization, letting them know where is the strength and where you should be
heading, the IT awareness, then your market
standards, any of the rules that you need to follow. The guidelines that they are there, you need to be aware of that. Aware of various different products which are there in the market. Apart from the tools that are mentioned, there are hundreds of tools, literally, and every single week or two there will be a new tool which is
emerging in the market. A new one which will be
doing the job even better trying to solve some other problems. So there will be a lot of tools out there but you need to choose the
one which works for you. It can not be just because your friend is working with it; it has to be because it works for you. It might be the case that the bigger tools are not a good fit for you, because the bigger tools provide a lot more features than they were originally meant for. Initially configuration
management was made for one single step, configuring the machine, that's why the name
configuration management. These days, configuration
management systems will also help you in orchestration, like setting up the
cloud, calling the APIs, getting the machines up and running, setting up the firewalls there, changing the load balancer statistics. So the configurations, all of these, they'll help you to do so. But they were not meant
for that task initially. So if you want a specialized tool you might want to add another
tool into this category which is meant for
orchestration: Terraform. It does just orchestration, setting up your cloud environment without getting locked into any one provider. So we call it a cloud-agnostic design. So that's one of the other things. There's a lot of stuff to do
but let's for the beginners, if you need to ramp up on the concepts this is a pretty good
start which can take you to at least intermediate
levels, if not advanced. And other things they
generally come with experience. Question here from Eric
that if we use AWS instances instead of VM, is there any advantage or we would be learning more that way? If you are using AWS
that's awesome because anyways on the laptop as
well, we'll be spinning up virtual machines,
you could do it there. Plus when you're working with
AWS you'll get to learn more about the ecosystem that AWS provides you or the products that AWS
provides you to work with. If you have an account I would recommend you go with that. But make sure that you have an ample amount of resources to spend on it, because if you keep your machines running for long durations, the free tier provides 750 hours a month in the first year, I think, and you will be charged for anything beyond that. That's the risk; other
than that it's fine. Question from Eric again,
please give an intro about yourself and your career experience. How did you end up learning
this all in your career and where you started. Well, I started as a developer initially and I was working on
a lot of stuff at once so I started my career
as a freelance developer so I had been working with
a lot of technologies. If you ask me to code in any of the languages out there, I should be able to do that. Plus I was exposed to a lot of problems and got experience in designing solutions for a lot of different scenarios. This was actually a good start for DevOps, and later down the line I
moved on from development to the DevOps side of it. So this side was really
fascinating back then. So it's almost four years of
journey when I moved to DevOps. Before that I spent
another four years doing the stuff and building solutions. Not just one particular
feature but I have been involved in end-to-end systems, from coding of it 'til deployment. Initially we were not
aware of this term DevOps. We were actually doing it all. We were doing everything
that requires to be done, including a version
control to the monitoring and analytics, but the world wasn't aware, we weren't aware of that word then. Now that it has been formalized, I can say I was a DevOps Engineer back then; it was just a guy who could literally do everything. If you want to be
a good DevOps Engineer I would like you to be
called as jack of all trades. You should be aware of everything
that is going around you. And again, if you have
a zeal for automation and you have a zeal
for reducing redundancy in the environment that you are in, you can definitely be a good
DevOps Engineer that way. So let's say in your
company maybe your DBA is spending five minutes
on a particular tool, just for monitoring it. He does it on four different
machines, let's say, or four different tools. So five minutes on each calculates to 20 minutes. And he does that work almost 10 times a day; it's his job, right, monitoring is his job. So he would be doing it 10 times a day, maybe once every hour. So he's spending how many minutes, do you think? 200 minutes. If you calculate those 200 minutes in hours, it comes to well over three hours. So he's spending over three hours a day just monitoring, because he has to log into four different machines, look at the logs, and figure out the stuff. Being a DevOps Engineer, you could create a small tool that will pull up the data from all of these four machines and put it on one single dashboard which he can view in one single minute.
So you're saving him 199 minutes of his precious life, and you are doing it not just for him but for the complete company. A new DBA joins and you'll be doing it for him too. This is where the original
investment in DevOps comes in. A lot of stakeholders would ask: how do I get a return on DevOps? If I invest X amount of money in it, and you say the DevOps transformation would take anywhere from six months to 18 months, how do I get a return on my investment? I don't want to wait that long. Well, this is how it generates money for you; it doesn't pay you out directly. I hope that answers the question. There's another question from Napinder. The question is, will you
be helping us in doing the stuff with Elk? Yes, after classes. If it is not covered here
and you want to try it out, I can help you out with Elk Stack as well. So it's true for any other
tool that you want to learn in matter of the next 30 days. Ask as many questions as you want and take me to any that you want. Cool, so let's move on. There will be a couple
of slides which will help me understand that
you are paying attention, although I could also
see here on the console who all are actually dozing off. Let's test your knowledge. Can someone answer this question for me? DevOps movement resulted in something, one of these three. Waiting time for code deployment is solved using something. And there would be a couple of questions down there which would not
be covered on the slides. So during those questions I will also explain to you what they actually mean and how we go about solving those problems. So the correct answer here is 1a. The DevOps movement resulted in improving the collaboration between the
developers and operations. We do not put in another team in between. People will bring in another team called DevOps and still their developers and operations won't know one another. So they will be treating the DevOps team as a bridge, and rather than being a bridge, it will literally act as a support team, supporting all the developer and operations requests. So turning your DevOps team into a support team is not what you really wanted, right. This is one of the anti-patterns of DevOps: thinking that by creating a DevOps team you are actually setting up DevOps. It's not. Waiting time for code
deployment is solved using continuous integration, we saw that. Continuously, rigorously testing your code every time a new check-in happens, making sure that it is always cool, and then moving it out. That means it's always
in a deployable state. And the big event of deployment turns into no big event, so it's good. So we do continuous integration there. Question here from Ms. Bustle: does not 1a lead to 1c? So improving collaboration between developers and operations, how would it result in a delay in the release cycle? It would actually result in reducing the delay. Yeah, that's what you meant, but initially your question read a little differently. Anyone, can you answer
this question for me? Which one of the following tasks does not belong to configuration management? We haven't talked about configuration management much, but I'm sure you have a fair idea of how it would work. Configuration management, as we have been saying, configures the machines for you. And an increasing number of servers can be effectively monitored by any of the monitoring systems; the monitoring system that we have here is Nagios. So the answer for question three is B, not C. Helping the developer focus on building the current code is a task of CI, which takes the code away and integrates it properly, so it does not belong to configuration management. Consistently provisioning the system and proactively managing the infrastructure is something that configuration management will do for us. So now, getting more technical,
what is the pipeline and what are the different components? The pipelines would be very specific to the problem that you're trying to solve, the project that you're building or the company that you're working in. It may or may not have exactly
the same number of steps but logically sections
fall along the same lines. The first section is version control, the second section would be testing. You could break testing into any number of steps, or collapse it down. Continuous deployment could also be split into various different steps, if you want to. Monitoring is automatically taken care of. If you want to configure that in continuous integration you could also do that, but I would keep it separate. So this is how the code moves. People don't really stack their
teams on top of one another so hierarchy that they
maintain is vertical. Assistant Engineer to
Engineer to Senior Engineer to Team Lead moves ahead to
Project Owner then goes to CTO. If Assistant Software Engineer
wants to deploy a feature which he thinks is really cool and will solve the problem in a better way, he needs to get approvals
from all of these people, and then the approval chain starts over email. People take a couple of days, a couple of weeks, to respond, just to decide that it's now the right time to release that particular piece. We don't want this approval chain to be that long, so one of the structural changes that should happen in an organization is to reduce the number of hierarchy levels. The best way to reduce the approval cycle is to automate it. How do you automate an approval cycle? You could branch it off. You could create your feature on a branch and simply show it. Looking at a feature running and seeing the impact on screen, with numbers, that this is how your system looked before and this is how it looks now, is far more convincing than simply starting a mail chain saying here is what I want to do, it will change the system like this and like that, without any concrete proof. And to test it you need to have continuous integration in place which is branch specific. So branch specificity is really important: you should not be able to mess with the master branch right away. You still work in isolation and the system can be tested in isolation. And that should happen automatically, because you want to expose it out to the world. What if your senior wants to check it in the evening when you are not in the office? We will see how to do that.
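As a rough sketch of what that branching-off looks like in Git (branch and file names here are only illustrative):

    git checkout -b feature/new-login      # work in isolation, not on master
    git add LoginPage.java
    git commit -m "Rework login page behind a feature toggle"
    git push origin feature/new-login      # CI picks this branch up and tests it

Because the branch is built and tested on its own, anyone can look at the results whenever it suits them, without master ever being touched.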
So here, your SCM is your source code management system, also called a VCS or version control system; you could call it anything you want. You'll be using a tool called Git for that purpose. It records all the
changes to the documents that we are building
along with the history and the different versions. So it answers questions like: who did this yesterday? Why did you commit this line? But how do we figure out the 'you' part, who committed it? There has to be a history. And for the reason why it was committed, there needs to be a message: I'm committing this code because of these and these things, or this is the reason why I'm committing it.
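For instance, a minimal Git round trip that records the who and the why (the file name is just an example):

    git add LoginService.java
    git commit -m "Fix session timeout on the login page"
    git log -5        # shows author, date and message for the last five commits
    git show HEAD     # shows the exact change together with its message

Every commit carries the author, the timestamp and the message, so the who-did-this-and-why question answers itself.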
All that history and all the versioning is handled by the SCM. We'll see how Git works, what its internal architecture is, the way it stores the data, and why it is good compared with the other systems out there. I would be using Subversion for that comparison. I think Subversion and Git are the two used most, and most of you will be aware of at least one of the names, so we will compare these two. There are other players in the market as well: CVS was there before even Subversion, and then there's Perforce, there's Mercurial. But I would be choosing the two leaders, Git and Subversion, for that purpose. Then comes testing. You could use any testing
strategies you want. For a Java project that
we will be taking up during the demonstrations,
JUnit will be used for unit testing. Then comes the integration testing, there are various different
frameworks like Cucumber out there which will be
helping you do the job. Selenium (often abbreviated Se) is also used for GUI testing, where you don't want to be sitting in front of a monitor clicking the buttons; it can automate that stuff for you. So continuous testing is something that we will be building in. Then comes continuous deployment, which can be broken
down into two sections, continuous delivery and
continuous deployment. What's the difference here? Let's say you are testing something. After it is tested
robustly, you say that okay, this particular piece of code or this particular bundle or package that I've released, the
JAR file or the WAR file, is your binary, whatever you have created after your testing steps are done. It's reliable and you want to store it somewhere. You're releasing that ZIP or that WAR file or that piece of code. You want to label it: this is good, it was built at this particular time, this is the version, and so on, so that you can refer to it again. You don't have to go through the complete CI section, the continuous integration or continuous testing steps, again. So releasing your code out is done continuously; that's why we call it continuous delivery, not deployment. Once you have released the code, it doesn't mean you will always deploy it. At every step we filter things out. We need to filter, otherwise what will happen is, say there are 500 developers in a company, everyone commits at least 10 times a day, and that is 5,000 changes. How many seconds are there in a day? 86,400 seconds, and out of that much time you are deploying 5,000 times a day. That means every few seconds your machine would be overloaded
with deployments. Deployment can be done at any time, but I wouldn't recommend doing it too often. Too often means it wouldn't give your machine any breathing time. Breathing time is required
for a lot of steps, a lot of things rather, to
refresh the connections, to get the new connections done, to build the pool, to build the cache. If you're taking care
of all of these things and making sure that
your machines are always in a warm state, you don't
need to rebuild anything. Only in that case can you redeploy as many times as you want. So what do we do to implement this? Simple strategies. Let's say 500 developers are committing a lot, 10 times a day each, resulting in 5,000 commits. But if you tell Jenkins to pull commits only every five minutes, it will be pulling commits in batches. So let's say in the last five minutes I receive 10 commits, and in another five minutes I receive, say, 15 commits. Jenkins will pull those 10 commits and then those 15 commits at a time. That's the first level of filtering: you're bundling commits together and then you are testing them together. So maybe the tests pass or they fail, but they do it on a complete block, so your code gets bundled up. Once you generate a package artifact, that package is released somewhere. After you release the package, that package will be released
multiple times a day. Now you could choose that I
want to deploy once an hour, I want to deploy once
every couple of minutes. Then you could choose the latest package which has been released
out and then deployed. You can slow things down
but not to the level that you would be deploying once a week. That's not advisable. Question from Hema: what are commits? Commits, I'd also call them check-ins. When you are ready with the code on your machine and you want to share it with other people, you want to push it out. You are done with the code, you want to confirm that it is going to work, and you want to share it with other people so that they can work on the other sections of the product. The product is a shared entity. Everyone else works on the same thing, so you want to share it. So how do you share it? Definitely not from your machine. You need to push it out. To push it out you need to let other people know why you are pushing it out, what the reason for it was, and who did it. Of course, you did it, so you do an operation. That operation is called a commit. So you commit to the code: I'm the one who is done with this particular feature, and for so-and-so reason I am sharing this feature with you, and it goes out to a particular machine. Either it will be sitting locally, in which case you would later push it out; I'll talk about what that means, but for a non-bookish definition, committing is finalizing your code and moving it to the closest repository. Repository as in the machine that is there; the closest one does the committing for you. Question from Sidar, is it
possible for a CI system to selectively build only a single JAR based on which code has been modified? Yes, you can do that. You would specifically take only the code which has been modified and then build on top of it. It can easily be done. One way of doing it: you could have the CI system use Maven to manage the dependencies, create your project in smaller chunks, and call all those chunks dependencies. For building one of the microservices you pull from those dependencies. So if the code for only that particular section has changed, you don't need to rebuild all the dependencies again; you only pull them as and when required. That can easily split up a lot of stuff. We don't need to rebuild everything. Question from Napinder that
do we learn something about Ant and Maven? You'll be learning about
Maven but not a lot. This is not the part about
learning a particular language and its build tools but
we'll be touching upon them. So Sidar, I'll have to look into that. The language-specific parts I will not be delving deep into, but I will have a look; if I come across something I'll get back to you. Cool, so that's for SCM. The next part would be continuous integration, then testing, then deployment. So Git will be the one that we are using. Right now it looks like it's just a word that we have heard. There is a local repository, there is a remote one. I'll talk about what repositories mean, what a working directory is, what the index is, what HEAD is, and the different pointers that we can use. But for now we'll keep it simple. So continuous integration, what happens? Your code is integrated continuously, as in when you check in or
you commit the process starts. If you have been paying attention closely your continuous integration
depends on your commits, and the complete pipeline is triggered whenever a commit happens. So your commits can actually be treated as the triggering point for your pipeline. Once you've triggered a pipeline, it's called a pipeline for a reason. You push something in and then you don't mess with it; it goes through the stages automatically and comes out the other end. It doesn't leave the pipeline in between. It can only be stopped, it
can not leave the pipeline. So let's say your testing failed, your code would be stopped here. So what do you do? You don't divert the code
from here, you again build it. You again fix the things
right on your branches and then again generate the trigger. That means again commit something and it again passes through the pipeline. So pipeline would be acting as a sieve, as a filtering point. You make sure that the code
is actually getting filtered through all of the different
stages and then getting past the pipeline so that you ensure the quality is always present there. So we will be using Jenkins. What does this Jenkins do for us? It handles automation. It is just a framework
without any of the features. All the features are provided by plugins so these plugins would be helping us to check out the code,
to get the code there, to build it, to test it, to push it out. So when I say pull the
code or check out the code, it means: connect to a particular SCM system, authenticate yourself, get authorized, pick the branch that you want, and get the code downloaded.
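In plain commands, that checkout step boils down to roughly this; the repository URL and branch name are made up for illustration:

    git clone -b feature/new-login https://git.example.com/edureka/webapp.git
    cd webapp
    git log -1        # confirm we really have the latest commit of that branch

Jenkins does nothing more mysterious than firing commands like these for you through its SCM plugin.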
All of these things can also be scripted; you could write a script for that. But since we are facing a generic problem, why not use a solution which is already on the market? So Jenkins would have a plugin which handles that part for us, and that script is a dynamic one, so you pass it some configuration and the script will
behave according to that. So you use a plugin for that so Jenkins would be allowing us to setup
a plugin and do our job. That's what Jenkins does, it
will only help us automate. And what would it automate? It would automate anything that
we can do all by ourselves. If you can not do it manually, it can not be automated on Jenkins. Jenkins only automates
firing the commands. It might look like it is
sending an email for you but you could also do it from
the command line, if you want. It might look like it is taking the code from a branch of a particular Git repository, but you could always do that from your command line too. If you can not fire a Git command on the command line, you can not automate anything on Jenkins that involves Git. Pretty much straightforward, right. If you can not fire commands as the Jenkins user on the particular machine on which Jenkins is running, it's not going to work on an automated basis either. Question from Eric, one confusion: commits would be done by many small modules in a full release package, and then deployment would be done when a new full build or package is produced after a commit updates each small module it contains. Let's simplify it. Anything you want to release is actually a snapshot of code at a point in time in your SCM. Whenever you want to release something you need a build out of it; you need a package of whatever you generate as a deliverable. But it is a snapshot of something in time in the SCM. After you add a commit, your snapshot will change. Your continuous integration makes sure that even with a small change to the package, even a small change to that particular snapshot, it is always stable; that's what it ensures. And if your snapshot is always stable, you don't need to worry about the rate at which commits are coming in. Even if they come 100 times a day, if your snapshot is stable, you could deploy it 100 times. For the reasons that I
told you: machines would not be able to cope with all the intermediate stuff they need to do during a deployment, so we would like to slow it down. Bundling things up is not always the solution, though. Sometimes one individual change, one single line of code, can also be pushed out to production. Think of it as, let's say, a small fix. Someone coded something
and he missed a line in just the login screen
so your login screen doesn't come up after deployment. You know that the problem is very small, it requires a quick fix. You just add a particular
line and push it out. For that, you don't need
to worry about bundling everything and building all small modules and then packaging it and pushing it out. You'll do it really fast. Question from Srinivas, what are the other integration products available? You meant to say continuous
integration systems or integration engines that
are available on the market. There would be Travis, there would be TeamCity, there would be Go; Jenkins we are going to cover. There is another one called Phabricator, which has got its own engine. There are a lot of systems out there. Question from Hammad: can I perform server patching using Jenkins? It would be best done using a configuration management system. So you could use Puppet for that, that would be your best bet, not Jenkins. Jenkins is only a facilitator. Question from Anup: can the build be triggered from the SCM, or from the place where the test cases are located, like Confluence? There are multiple ways of triggering Jenkins. I'll talk about it in tomorrow's class. But basically you could trigger it from a commit. You could also trigger it from external sources, like Confluence. You could also trigger it from your scripts, if you have a program to do that, or from a cron job, or schedule it on Jenkins itself to run every couple of minutes. There are a lot of possibilities, choose the one that suits you.
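Just to make one of those triggering modes concrete, here is a hedged example of the remote trigger, which you could fire from a script or a cron entry; the Jenkins URL, job name and token are invented, and the job must have remote triggering enabled:

    # Kick off a Jenkins job from outside Jenkins
    curl -X POST "http://jenkins.example.com:8080/job/webapp-build/build?token=MY_TRIGGER_TOKEN"

The same job could equally be started by a commit hook, by SCM polling, or by Jenkins' own scheduler.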
Question from Napinder: how is the code built or packaged? Does Jenkins do it for us? Again, Jenkins is just a facilitator. When packaging the code, if you would yourself fire something like "mvn package", because that's the goal you want to run, or whatever other build command packages the code for you, Jenkins will do the same thing. It will fire the command "mvn package" for you. So it's not Jenkins that does it. You will have to configure everything yourself and make Jenkins do it for you.
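For our Java use case, the commands Jenkins ends up firing are nothing more exotic than what you would type yourself, assuming a standard Maven project:

    mvn test       # run the JUnit tests
    mvn package    # produce the JAR/WAR under target/

Put exactly these commands into a Jenkins build step and Jenkins will run them on every trigger.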
Question from Nagendra: so, simply put, Jenkins is used to create a build from SCM, and this build is what is tested? Yep, though I wouldn't say Jenkins is just used for building from SCM; it is used for integration. So it's just an automation engine. It can be used for multiple things like pulling the code, building it, testing it, deploying it, pushing it as a release, and then triggering a
deploy in production. People can also use Jenkins in production, although I really discourage you from doing that. So in simple terms what you say would be yes, but it's not the whole story. Question from Srikumar, how about Electric Commander? Again, use any tool that works for you. Question from Sidar that
are you going to cover orchestration as well? No, unfortunately, because for orchestration you need APIs for the infrastructure, and for infrastructure APIs you would really need to depend on cloud systems. So I would not be covering orchestration here. If you want to automate virtual machines using Vagrant, it's totally up to you. That would be a good bet if you want to do it here on your laptop. Question from Srinivas: does Puppet support the IBM AIX operating system? I think yes, a couple of friends of mine are using it that way, but I haven't used Puppet on that operating system myself. See, it totally depends on the client part, and I think there are clients for that operating system as well, just have a look. Question from Hema that
would you also cover Python basics in this class? Well no, I would not. There's a nice ebook if you'd like a reference, instead of the complete reference documentation, which is a lot to take in for beginners. The ebook is How to Think Like a Computer Scientist, the Python edition; that's the long name. It should get you started. In just a couple of days, if you really practice, you should be good with Python. Kumar has shared this URL for AIX support, and there you go Srinivas, you should be able to get to this link and help yourself out. So Jenkins helps you in
automating everything that you could do manually. On the testing front, you could use any testing framework that you have. But since it is a pipeline
it is triggered continuously by the commits; hence the word continuous has been used multiple times all over the place. We will be using Selenium. We might also use JUnit, it depends on what kind of test cases we are going to run. We'll see that in tomorrow's class. Test scripts can be written in multiple languages. Selenium helps you run tests against multiple browsers at once. That's not a capability of Jenkins, again; it totally depends on which WebDriver you use. It has nothing to do with any CI/CD. It is totally your tester's job. If your testers say we are just building the backend services, we don't need Selenium, that's totally fine, they use something else. Maybe you have Python code, so you wouldn't be using JUnit for that, you would be using PyUnit. If you have PHP, you would be using PHPUnit instead of any of the other frameworks we are talking about. It depends, right, so don't
get stuck with the tools. That's again a word from me,
don't get stuck with the tools. Configuration management, we
are going to cover Puppet. Puppet is going to help us
in setting up our machines. So if we kill the machine,
you want to set it up again, you just bring it up
and configure it again using the script that you have. Well, writing scripts: we are not really going to write a script, you are just going to describe a state. We are going to tell Puppet, and any configuration management engine works the same way: we tell it the end state. So we'll be telling Puppet, hey, we need a machine with Apache installed on it, and the home page should look like this, and the directory should be like this, and I want a new user to be created and authenticated with this particular password. You need to describe it, and there is a specific format for that. Every configuration management engine has a special format that it follows. Puppet has got its own; Chef uses Ruby as the description language; Ansible uses YAML-format files, and the same goes for SaltStack.
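Just to give a flavour of what describing the end state looks like, here is a tiny sketch of Puppet-style resource declarations applied from the shell; the resources are illustrative, not our actual class setup:

    # Declare the desired state: Apache installed and running, plus a user
    sudo puppet apply -e '
      package { "apache2": ensure => installed }
      service { "apache2": ensure => running, enable => true }
      user    { "deploy":  ensure => present, managehome => true }
    '

Notice that you never say how to install Apache; you only declare that it must be installed, and the engine works out the steps.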
Puppet has got its own language that we will be talking about. I told you that the learning curve for Puppet is a little steeper, so we're taking that up. Then when you have
configuration management your configuration may be for a container or may be for a virtual machine. Depending on what you talk about your continuous virtualization
would take care of it. So continuous virtualization here you would be working on
virtual machines all the time plus we would also be
covering containerization so we will be deploying our
stuff inside containers, not tomorrow but later down the line. So we will be building
one thing at a time. And all the stuff that we are building integrates with the other
stuff that we are learning. So if we learn Git tomorrow I would be using Git when I talk about Nagios. So it is going to be
repeated multiple times so don't worry about that. But just in case, I don't
want you to be asking Git questions in Nagios class. In Docker we will be
talking about how we spin up containers, what they are, what the differentiating points between virtual machines and containers are, and where we use them. What does the stack look like, where do we really lose out with virtual machines or gain with containers, and why should you be using them? I am not totally in favor of containers, nor am I totally in favor of virtual machines. It's the problem statement that decides what you should be using, and
then comes the monitoring part. The monitoring part we just
covered here in the class, it's not complete. When I say complete, it does not cover the application monitoring section of it. Although you could do that using a plugin which we will be using for sure but Nagios is not good at that. So use a tool which is good at the stuff that you want to do. If Jenkins doesn't work
for you, scrub it off, use something else. If for application monitoring you think that New Relic is a good contender, or if you log everything and put it into Elasticsearch and that would be a good choice for you, go with it. Nagios has been around for a long time, so it's like a caveman's tool for doing the job, but it's really awesome at the infrastructure level, and not a lot has changed at the infrastructure level. We still have got CPUs, we still have got RAM, we still have got disks. Since those things haven't changed, Nagios hasn't changed for a lot of people. But on the application side it's a totally different story, so use the tool that suits
your requirement there. Question from Sidar that instead of using a Windows Server in VM,
can we do the assignments by downloading and configuring
the CM tools for Windows? Yes, you can do that. But a lot of things will
change when it comes to configuration because it is
operating system specific. Things that work on Linux
would not work on Windows. Jenkins, the testing, as well as Docker would work much the same way on Windows; for other things like Nagios and Puppet I would recommend you still stick with the Linux VMs if you can. Question from Michelle: for our basic project, what will help us decide the configuration of the VMs? I will not talk about production; I will talk about the current state on your own machine, when you need to be creating a lot of virtual machines. See, the server edition of Ubuntu takes no more than about 150 MB of RAM to run. So if you have low resources, 300 MB of RAM on the server edition of Ubuntu 14.04 is fine for any given day, any given machine, apart from the Puppet Master. I'll let you know when you configure the Puppet Master that you should have at least 1 GB of RAM there. Even 512 MB is fine for a couple of nodes, but you will have to configure it; I'll let you know where to do that. Apart from that, 300 MB is a cool deal. In production, the more you have the better, depending on the services that you run. I wouldn't recommend having a lot of RAM, say 4 GB or 8 GB, for an nginx
web server running there. So you have to benchmark it. Benchmarking and load
testing, stress testing is one of the biggest challenges
that a DevOps Engineer has to take on. It gives you very crucial statistics about how things perform in production and how much it is going to cost you. It's not that you can simply put one service per machine, think that maybe M3 instances are big and robust enough, and deploy everything on that. It's not like that. For now 300 MB is fine. In production you need to benchmark it and then see how much you really need under load. That's another reason
why we use containers because we are not sure. I'll talk about that when
we talk about Docker. Continuous monitoring, it
would be Nagios for us. What Nagios does is take input from one side, the infrastructure, pass it to the server that collects it, and then notify us using various different means; we'll talk about that later. So now you know the various different tools and where they fit in, and that's your end goal. So this is what your pipeline would look like theoretically. Practically it would be a lot different. I'll talk about that in
the next couple of slides. So here are some questions. I didn't talk about Jenkins much on the technical front, because there are classes in which we will be talking about it in more detail. By the way, just for curiosity, Jenkins is written in Java. It was initially created for a Java project, and then people thought: we have really created a nice tool, a nice solution, let's be generous and give it out to the community. So these generous people released the solution as open source. And there are a lot of people who like to do that: all the Apache guys, the ELK guys, they have released solutions to the public where we can use them and contribute back. So it was written in Java, it was meant for Java, and then people started to contribute very generously to Jenkins, which made it able to work with other languages as well. Now, in the Git repository question,
it doesn't make much sense right now, the answer is A. I'll tell you why it is
A, how it is structured, what is a bare repository,
what is a workspace, how it looks like, what are
different folder structures and all that stuff. But just get past this
slide; it's A for the first one, and Jenkins is developed in Java. Can anyone answer this question for me? When should automation testing be used? It's a common sense question, let's see. Anything that relates to culture is common sense. Okay, we see a problem, we adapt ourselves; if we don't, we lose. And question number four is about what the Puppet Master contains. Scrub it off, it's manifests, by the way. I'll talk in detail about what manifests are, what reports are, how we build them, and what different modules we could use. But I don't want to get ahead of myself, because if I tell you what .git is then you'll ask me a thousand more questions. Instead of doing that, let's see how it looks and then we can definitely come back to questions. Question five: containers running on a single machine all share the same OS kernel, start instantly, and make more efficient use of RAM. The answer is yes. That is the biggest differentiator between your virtual machine and
the reason why people are using containers. It changes the whole idea
about resource allocation. When you create a virtual machine, you are actually pessimistic
about your resource allocation. Let's take an example with AWS. You want a virtual machine
from AWS and you say: I want a virtual machine because I want to run a web server. AWS is very skeptical about things. AWS will say: I don't care what you run inside of it, here you go, you take 2 GB of RAM and 2x computation power and give me this amount of money. I don't care how much of that 2 GB of RAM you are going to use. That's the pessimistic approach, because they don't care. We care: even if we are using one percent of the RAM, we are paying for 100 percent of it. Bad idea, right. If you are running a bigger company like Google or Facebook and you are wasting 99 percent of your resources like that, because AWS is just pessimistic in its resource allocation, it wouldn't work for you, would it. No. So we make more sense of things. We say, let's do one thing. I don't have to run one service per machine, and even if I do, I do it for a reason. I do it because I want to maintain isolation between my services. If I could fix that isolation problem, if I could prevent one process from killing or interfering with another process on my own machine, I could easily fit in as many services as I want, correct. The same way you are
running over 100 processes on your operating system right now without worrying about any of the process actually eating up your memory. Are you worried about it? You're not. Can you spin up another 100 processes on your laptop and not worry about it? Yeah, very optimistic about it. You would do it, right. Right now your operating system might be operating at
20 percent utilization, 30 percent utilization. That means you are wasting 70 to 80 percent of it. If you could fit some more processes onto it, you would be making closer to 80 percent, a more optimal use of your resources. That's the way Google and Facebook, the companies at that level, would think, because they operate their own data centers; they don't buy cloud services from someone else. So for that, they have moved to the container scheme. Instead of running one service inside one virtual machine, let's run multiple containers and not worry about wasting a lot of resources. So I can make optimum use of my resources and push as many services as I can onto one single machine until it reaches some optimal level. The optimum level could be 80 percent, or 85, 90 percent, whatever you choose for the infrastructure. Don't worry if it doesn't
make sense to you right now. I'll do this discussion again just for the sake of questions. The answer is true here and
what are the objects in Nagios? Well, again, that will be kept for the last class, as they are the elements involved in monitoring. They don't change, they literally don't change; it's the same old architecture. Even on the hardware side: same disks, same CPUs, same RAM. Capacities and speeds have increased, but it is pretty much the same. You wouldn't find a computer without RAM. Look at this from a case study perspective. So your ecosystem would look like this. You will be learning things over the next seven days and at the end we'll be combining everything into one, like revising everything we have learned so far. During the other days as well I will be integrating the new stuff with the older stuff. So slowly and steadily we will move from left to right, as the code moves. Your code moves horizontally, it never moves up the hierarchy. Let's look at what our
use case would look like. So in Module 2 we are learning
about Git and Jenkins. A lot of questions about
Git and why, how it looks, where to configure stuff,
what hooks are and what they are not, Git commands, branching strategies, merging, why we use it, how we do it, what stashing and unstashing are. I'd recommend you go through the slides for tomorrow's class before you arrive. Then we'll be talking about Jenkins. What is CI, what is Jenkins,
how do we create jobs, how do we setup plugins, what are users, how do we create them,
how do we allocate roles, how do we setup the security,
how not to allow anyone to mess up the Jenkins? Authentication, all that stuff. How to integrate that with existing Git. Now that we are learning a new tool, we will try to integrate
it with the previous one. And that's how we learn
more about the new tool as well as the older tool. That's how I like to take things forward. Module 3 would be talking
more about Jenkins. So Maven, building, creating of test jobs, releasing stuff, deploying
stuff, notification systems. All of that plus we will
also have some code with us which will allow us to get our code from the initial level, that means
SCM part, through Jenkins, testing and then releasing and deploying. So we'll talk about the
class project as well. That is going to be our end
goal of total eight modules. I'll talk about what the
test case would look like in coming slides. Module 4 is about Docker. So people who have been
asking about containerization, it is a Docker thing. So in Docker we will be talking about very frequently used
commands, its use cases, why we use it in the first place, how it differs from virtual machines, how virtualization helps, what hypervisors are, why it is difficult to run Docker on Windows, and why it is that Windows supports Docker containerization in the cloud and not here. There are a lot of reasons behind it, so we'll talk about that. I'll go into the details of it. Conceptually, it matters a lot to hear this talk. So if any of you are actually planning to miss a couple of classes, please don't. If you are also thinking about not attending right now and revisiting the videos later on, I wouldn't recommend doing that either. See, it is a three-hour class. If you go through the video, it's going to be three hours again watching the videos, and it's
not going to be three hours it's going to be more than
that, pausing, running, pausing, running, and doing
other stuff in between. Don't do that, be in the
class, take down notes, concentrate on what is important, because there are a lot of concepts I have been talking about which are not on the slides; they come totally out of experience. You can take them down and then do all your research on Google. You never need to come back and waste another three hours on videos. Please don't do that. How do we distribute the images so that they can be reused? That is one of the interesting things about Docker. Well, Docker is not a container; Docker is just a wrapper on top of containers. So container technology itself is different; I'll talk about what that is. And there are different players in the market apart from Docker who are providing this container technology. But we know about Docker
because it's famous. Yeah, we'll be talking
about containers a lot. An interesting fact about Docker is that it provides us a layered
architecture where we could actually have versioning
of our operating system. You know versioning of code, right. You commit something,
you get a new commit ID or you get a new revision ID. You can version it, you can
say my version is so and so. You can also do that with operating system and that's interesting. If you install a package,
Docker will help you understand how many files changed, how many files were added, deleted, or modified, and what changes you are actually going to commit back. So you can commit an operating system on top of an operating system. It can seem fascinating, right. I'll talk about that in detail.
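As a rough taste of that layering from the shell (image and container names are made up):

    docker run -it --name demo ubuntu:14.04 bash
    # inside the container: change the filesystem, then leave
    apt-get update && apt-get install -y nginx
    exit
    # back on the host: inspect and snapshot the changes
    docker diff demo                            # lists files Added/Changed/Deleted
    docker commit demo myteam/ubuntu-nginx:v1   # a new, versioned image layer

Each commit becomes a new layer with its own ID, which is exactly the versioned-operating-system idea.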
So Class 1 is all about letting you know what's coming ahead. If you are overwhelmed by all of these things, I'm sorry for that, but it will only prepare you to look ahead and be ready for the next classes. Docker would be distributed
over next two classes, Module 4 and Module 5, total of six hours that we will be spending on it. Detailing from image creation
to automating everything to re-creating and snapshotting images, to release the containers out and to deploy it onto the end machines. Some of the sections, which
we are going to deliver before configuration management, they are going to be manual. So we would be doing things
at a very crude level. It's always good to know
things at the crude level because when you automate them
you will see the difference, what automation brings you. It is very easy to create a Google account and then simply complain that email is not working. But it is even more interesting to create a Google account from behind the scenes, from the CLI. That's the interesting part. Even in your company, when you deploy something by running a configuration management system or getting stuff done via OVA images, it's fairly simple because everything is configured there. But when you literally create a machine from scratch, set up the keys, create the user accounts, get the packages there, resolve the dependencies, then configure your code, make sure that it works, build the firewalls, and then harden your server: you do all of these things manually first, and then you automate them, for sure, because anyone who has automated something must have done it manually the first time. And I want to teach you the very first, ground-level stuff about it. That's what we are going to do. Module 6 and 7 we'll be
talking about Puppet in detail. If not to an expert level, you will at least be taking it up to the intermediate level, where we will understand what Puppet, or any other configuration management system, is and what it does at the ground
level, where to find the help. For any other tool also
I will be doing the same because finding help is the
most important part of our life. Where to find it, how to
use it, how to read it, how to make use of it,
it's really important. Asking questions will
not always help you out because it takes time to
first get the right question and then ask it to the right person. What the architecture looks like, what the terminologies are, what the language looks like, and how it is different from other tools: that is Module 6. And then in Module 7, we'll make sure that every architecture piece
that we have built so far we will be converting that to Puppet. So we'll be using the modules, we will be using the templates
to configure a machine to push the files there to make
sure that your end machine, let's say you are deploying it on some web servers, has got Nagios, has got nginx, has got the HAProxy configuration on top of it; it will be all of that. So we will be using and
implementing Puppet in Module 7. Also learning about some Puppet internals, Puppet language basically. We do have a problem statement
that we would be taking up. I'll talk about this problem statement in a couple of minutes. There are a couple of questions here. Hema is asking a question that can I ask Ansible-related questions
after this class? Yeah, please do so. Question from Irwin, what
is the main difference, advantages between Docker and Vagrant? It's totally different. See Vagrant works with virtual machines. So if you're working
with virtual machines, Vagrant is the tool
that you would be using for automating stuff. Docker, on the other hand, is a container management system, so it's totally different. I'll talk about that in the
containers class, Module 4. So that should give you an idea, virtual machines versus containers. It's something that you compare
like apples versus oranges. You can not do that. Question from Napinder that does Jenkins package code with Docker? See Jenkins can do anything
you can do manually. The straightforward answer is yes. So you can have a Dockerfile that will get the code and build an image out of it, and then you could deploy the image rather than deploying your code. That's always a good practice, and Jenkins can help you do that.
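A hedged sketch of that practice, with invented names and a standard Tomcat base image:

    cat > Dockerfile <<'EOF'
    FROM tomcat:8
    COPY target/webapp.war /usr/local/tomcat/webapps/ROOT.war
    EOF
    docker build -t myteam/webapp:1.0 .
    docker run -d -p 8080:8080 myteam/webapp:1.0

Jenkins can run these same commands as a build step, so the thing you deliver and deploy is an image rather than a bare WAR file.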
Question from Hema: I have got Ubuntu 12.04 installed on two VMs, does it work? It would work, but I would recommend you use the latest one, 14.04. If you want you can also jump to the 16.04 version of Ubuntu. Make sure that it's the server edition. I don't want any colorful screens, because they take a lot of resources. Plus, if you really want to be working along with me, I want your life to be black and white for the next eight sessions: black and white screens, no colorful screens whatsoever, and there is a reason behind it. The reason is I want you guys to imagine, because when you talk
about cloud infrastructure and architectures and virtual machines and all of that stuff, you
really need to feel, with your imagination, that there's a machine sitting right there. You can not see it, right. You just need to imagine it, and for that there just needs to be a black and white screen. The way you jump from box to box, the way you copy the keys over, I want you guys to get a feel for it. So Ubuntu server edition, 14.04, as a VM, that would be great. It's 10:25 here on the clock. I will be getting back in exactly 10 minutes, which would be 10:35, and then we will continue with the remaining slides. Question here from Vishal, a quick one. Will we be getting the
list of tools required so that I can be prepared
for upcoming modules? Yes, why not. You have already got the list, right? Right on the slides. So you could do that. Working along with me live is only recommended if you can really work that fast and you already have an idea of what's going to happen; otherwise you will lose track of things. So it's always good to take notes and then do it after the class. You should always be prepared. Downloading a tool is not a big deal, it should be very quick. Question from Eric: any good book you recommend to learn the details of operating system concepts, like processes, resources, memory leakage and all that stuff? The old favorite called Operating System Concepts. You could Google it. It's the book that is taught to college students early in their degree. No one talks about it unless and until you are really worried about that part. The whole idea of being a DevOps Engineer is that people take it for granted that you know these kinds of concepts, so you need to be prepared. Good question that you have picked up. The book is called
Operating System Concepts. It will talk about everything along with the algorithms that are being used. Question from Napinder,
will we be getting the PPT? Yes, go to the LMS and PPT
will be shared with you. It's shared for everyone, right? You should be able to get it. Question from Temish, will you be using VM in tomorrow's class? Yes, pretty much I will be using it. About the problem statement,
so what we are going to target with the continuous integration, continuous deployment pipeline
that you're going to build. What is this pipeline solving, what problem are we looking forward to? This problem is that we have got a website which we are hosting. So this problem is specific
here for Edureka's use case. Thousands of learners,
6,000 classes in parallel, and they have got a large audience which is actually spread
across 70 countries. So it's global; we need to be operational 24x7. I can not say we will be down for maintenance for, let's say, two hours or four hours; it doesn't work that way. It could be night for me, but it could be daytime for a couple of other countries. We can not do that, that's not an option. They have a new homepage
which runs on Tomcat, which fits properly
with our Java use case. They want to launch it
and they want to expose it only to people who are in U.S. So there are various different problems that need to be solved. The very first, 24x7 uptime. The next thing is that we have to support a large number of people. That means you should be able to scale up as well as scale down because these people will be available only around weekends. A large population would
be only around weekends like we are visiting
Edureka's website right now, during the weekdays we
would not be doing it. And it would be only exposed
to a couple of people from U.S. region, not from India, not from any other continent. They want it to be exposed
only to the U.S. guys. So these are the things: how would we be able to serve all of that? We will be building a CI/CD pipeline for that, and that is going to be our end goal. Other concepts which are not on the slides but which we will be covering, which
definitely should be there, we are going to talk about scalability, we are going to talk
about high availability. We are going to talk about load balancing and load balancing types. We will be talking about availability zones, subnets, clusters. We would also be talking about monitoring across various different regions and availability zones. So we'll talk about these things as and when required, when we pick up the relevant tools. If I skip a couple of these points, it's because this is something I would like to take up whenever we've got time between the slides. Take it down and ask me
questions whenever you would. Also come prepared with
a couple of articles. There are a couple of websites which you should be looking
at, DevOps for Coms 1, there is one other one called DevOps Sky. There is Agile.com which
would help you with some more information on
DevOps and all that stuff. These are the cool websites
so you should read them, actually subscribe to them as a feed, so that you are never lagging behind with the data that you have. So, in the problem statement, what they are doing is configuring everything manually, and they want us to automate that. Shell scripts are not working out well, because shell scripts need to be triggered on all the machines, there needs to be a triggering point, plus if something fails in a shell script you can not revert it. If you fire a command that prints Hello World, can you revert it back? No, right. Anything that you fire on the CLI can not be reverted unless and until you know which points it impacted. So you need a system to do that, and that is the reason why we are going to use configuration management. We are going to use Git
for pushing the code out, continuous integration
would be done using Jenkins, deployment would be done using Docker on containers in the machine. These are machines, Machine 1, Machine 2, Machine 3
are all Ubuntu machines. Here if you see multiple services are being overlapped
on one single machine. In the ideal world in production
you would never do that. You would never have one machine sharing multiple responsibilities. It is always a one-responsibility model, so one machine would be responsible for one single aspect or one single service. So your database machine would never run a web server on top of it, and Jenkins would not share responsibilities with the Puppet Master. You need to be able to isolate that. Here we do overlap, because we have a small number of machines; we need everyone to be on the same page and doing stuff along with us. So if you have the ability
of creating more boxes, please go forward and do it. So create a Puppet Master on one of the machines, Jenkins on another box, a Jenkins agent on another one, and deployment would be happening on yet another box. Your production machines would be at least three: two servers behind the load balancer, that kind of scenario. We'll talk about that. As and when required, I'll also tell you when to overlap the machines
and when not to do it. It's totally up to you, depends on how much resources you've got. Then we're talking about
Selenium and testing. We'll be using the test
boxes for that purpose. So it would be doing a lot of stuff and the setup would look
like this, similar to this, but we will definitely scale it up, because while I'll be talking about scalability I would also be talking about high availability, building redundancy into your infrastructure so that you wouldn't be worrying about what happens if a machine goes down. That capability has to be built, that solution has to be built, right from the architecting phase, the initial phase itself. If you think about it later on, you would be going through a lot of pain. If you want to read up on it, there is a book called The Phoenix Project; it has got an awesome explanation of the use cases, the case studies, and the way people took up DevOps and the way it changed things. It's a nice book to read. Question from Vishal
that will we be learning how machines communicate with each other for services they run? Yes, we'll be learning that. Of course not on the slides again. Park this question where I will be talking about deployments. Bring up any other machine,
I will also show you how to set up passwordless authentication, because that's important when you talk about automation. In automation we try to remove any human intervention, so things like entering a password or typing yes or no during installations are totally ruled out, and we need to find a way to do that. So learning how machines communicate: over IPs, or over the IP stack which is there, what the different protocols are that they use, when to use UDP and when not to use UDP, how we do health checks at Layer 7 and Layer 4, what the differences
between both of them? How do we configure our load
balancer to do that for us? All this stuff will be coming up, of course, in between the slides. So every class is different, depending on the kind of audience, your interaction level, your eagerness to learn. It's really up to you how much you dig in. I'll be sharing the knowledge as best I can. Question from Hema that can
we install Vagrant on Ubuntu? Yeah, why not. You could do that. You would install
Vagrant on Ubuntu and ask Vagrant to spin up
another virtual machine, which will have Ubuntu inside of it. Yeah, why not? You could do that. - We will be creating
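If you want to try that, the commands are roughly these; ubuntu/trusty64 is the common 14.04 box, but use whichever box you prefer:

    sudo apt-get install -y virtualbox vagrant
    mkdir devops-lab && cd devops-lab
    vagrant init ubuntu/trusty64   # writes a Vagrantfile for an Ubuntu 14.04 box
    vagrant up                     # spins up the VM
    vagrant ssh                    # log into the new Ubuntu guest

So the host Ubuntu runs Vagrant, and Vagrant manages the guest Ubuntu VM for you.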
feedback loops as well. The whole idea of creating
feedback loops is to have insight into what the CI/CD is doing, what your bots are doing and how they're performing. Your feedback loops should always be targeted at being shorter but at the same time more amplified. What does that mean? Shorter means you should be able to get the results back soon. It should not be that I commit right now and I'm still waiting to know whether the tests are failing or not, I don't know, and after a couple of hours I get to know that the deployment failed. The whole time I was just waiting and wondering what's happening with my code. That should not be happening. We should get the response out as soon as possible; that's the shorter cycle. And amplified as in it should
give us actionable items instead of telling me
that the deployment failed and leaving me wondering what happened, my system should be
smart enough to tell me that buddy, this
particular line is missing, this particular semicolon. You go and fix it, that's actionable. So getting actionable items out is better than simply getting notified. Sending the notification is good, sending a lot of notification
is really, really bad. People tend to overlook it. If you get notification
for every single commit that is happening or
every single deployment that is happening on a production, or every warning that is
being generated in production. People generally tend to overlook it. And sometimes when a
real error is happening, they simply think that
it is a normal thing and they simply don't
change the notifications. We don't want that to be happening. So you need to filter out what
is important, what is not. If it's not important, should we even send a notification for it, or shall we handle it as a part of our runbook? Can the system self-heal? The system should be self-healing. Using configuration management to make everything re-creatable is fine on its own, but when you assist that with a runbook, with some smart, AI-like capabilities (I wouldn't really call it AI, just do-this, do-that kinds of if-else statements), and you complement it with that type of capability, then your configuration management system can create a system which
can be self-healing. So if you lose a machine, your configuration management system will be automatically triggered and will get that machine back completely configured, back
in the cluster and running. That prevents a lot of 2 AM, 3 AM calls for the ops guys, and they will be more than happy about that. A DevOps guy never receives calls at 2 AM or 3 AM in the morning; if he does, it means his DevOps setup is not good enough. I never got those kinds of calls in my career, because when we build it this way we sleep very soundly. We know that if something fails, it is going to self-heal. We don't need to worry about it. We will check the logs tomorrow morning, probably find the root cause of what happened, figure it out, and fix it so that
it doesn't happen again. That's the whole idea how
you deal with this problem. How do you trigger it? Monitoring, so monitoring has to be there. We need to trigger things
using the monitoring. How does monitoring trigger it? Monitoring will only give you logs, generate some real-time snapshots for you. You need to analyze it and
there needs to be another server which will be analyzing
the things continuously. At the same time it will
also check the patterns. You know machine learning,
some people do deploy machine learning for the same reason, for just getting notified. If you want to do that, AWS has got nice machine learning capabilities,
streaming capabilities and queuing capabilities
that you could use, or you could create one all by yourself. There are some paid tools out there as well which run on top of your storage systems, like ELK, or Splunk, which can generate notifications for you. So you could do that, and you would use it just for that. Again, I repeat, it's not a great tool for analytics, it's only for monitoring purposes. There's a big difference between analytics and monitoring, so you need to have a tool that will help you do things the right way. Question from Temish: Splunk can also be used for monitoring? Yes, it can be used for monitoring, but you will have to make sense of what data comes into Splunk. Splunk has got a dashboard on which you can query the data. Nagios has got a dashboard for which it knows what data is coming in and makes sense of it automatically. In Splunk you need to create the dashboards yourself. That's the only overhead, but
otherwise you could use it. Question from Nikal,
can you please suggest any free version control tool? Actually GitHub is not free. Is there a difference
between Git and GitHub? I'll come to your question, just park it. Question from Srinivas, what
is the scripting language we are going to use in this session? We're not going to use
any scripting language, we are just going to use
the code which is written in Java for demonstration purposes. So I'll show you how the check-ins work, how the code flows, how we bundle it, how we version it; that will be taken care of. As for a scripting language, if you want to learn one and you don't know any, you
should start with Python. If you're interested in knowing shell scripting, for the basics there's a nice resource, a one-page tutorial; just hold on, I'll find it for you and share the link. Knowing a good resource always helps, because there are a lot of resources on the internet and every author describes things according to their own understanding. Of all of them, I found this one for shell really easy for beginners, and the ebook that I've told you about for Python would really help you out. So I've shared the URL for UNIX, as in Linux, scripting. Another question here from Srinivas: right now I'm working on Ruby and Ansible. Yeah, no problem, you can start anywhere you want. If you are already working with Ruby then Chef would be really easy for you if you want to get started. I have shared the URL in the chat, so you must have received the message from me. Cool, so talk to me about these questions if you are all with me right now. HEAD in Git contains,
okay, forget about it. Another one: the tool in which virtual images are created for maintaining the uptime of the environment is... can anybody guess this one, question number two? Okay, you're talking about images a lot, right; I was referring to Docker. Cool, thanks for that. Next question: in our use case, continuous testing is done using what? On the slides I have been referring to a name, so that is the answer. Another one: Puppet is, okay, I'll make this easy, the master-agent, also called master-slave, kind of architecture. That is one server with a lot of clients, and I told you that this kind of architecture is really helpful when you want to create scalable systems. Does that mean Ansible is not scalable? It is scalable. You can have as many parallel connections as you want, but the number of ports can be a limitation somewhere, and managing machines that way needs a lot of network bandwidth as well. So it depends on what you need to do and how big the infrastructure is.
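To picture where those limits come from, here is a rough, purely illustrative Python sketch of an agentless, push-style run: one control node fanning a command out to many hosts over SSH. The host names, the command, and the worker count are hypothetical; the point is only that every host costs the control node a connection, a port, and a slice of its bandwidth.

    # Illustrative only: one control node pushes a command to many hosts in
    # parallel over SSH. Host names, the command, and the worker count are
    # hypothetical.
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    HOSTS = [f"web{i:03d}.example.internal" for i in range(1, 201)]
    PARALLEL = 25   # simultaneous SSH connections opened by the control node

    def run_remote(host):
        result = subprocess.run(
            ["ssh", host, "sudo systemctl restart nginx"],
            capture_output=True, text=True, timeout=60,
        )
        return host, result.returncode

    with ThreadPoolExecutor(max_workers=PARALLEL) as pool:
        for host, code in pool.map(run_remote, HOSTS):
            print(f"{host}: {'ok' if code == 0 else 'failed'}")

A couple of hundred hosts is manageable this way; with thousands, the single control node becomes the choke point, which is where a master-agent model that lets agents pull their own configuration scales more comfortably.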
So if you ask me blindly which one you should use, I would tell you that if you are running fewer than 100 or 200 machines, it's fair enough to use Ansible. But if it's more than 500 boxes, use whichever of Puppet or Chef you like; even SaltStack would fit in that case. So in this module, what have we talked about? The why and how of DevOps, and also what DevOps actually is. We have talked about CI/CD, the pipeline. I told you what continuous delivery is and also differentiated between continuous delivery and continuous deployment. I helped you understand how the ecosystem, or the different modules, is distributed and what we are going to cover in which particular module. And the use case, the end problem
that we're trying to solve along with the pipeline. So here the pipelines are different. Let's say in tomorrow's
lesson you will see what the pipeline looks like, why is it called a
pipeline, what are jobs, how do we structure them,
how do we trigger them, and what particular pipeline
with how many stages are we going to build? Our pipeline will have roughly eight or nine stages. The one that we have seen during this class has about four or five stages; it depends on the problem statement. Question from Eric: what is the difference between continuous delivery and
continuous deployment? Okay, so when you have a deployable and you store it somewhere, you have delivered it. That's the end product of your continuous integration. The way you store it after your continuous testing is what we record as continuous delivery. You've delivered something now; whether or not you deploy it is a different matter. You have delivered the end product, and that's called continuous delivery. When you create Docker images from your continuous integration, after your testing is done, that part is what I would call continuous delivery. Once you deliver, you then decide whether or not to deploy. From those artifacts we choose the one that we deploy continuously out to production. If that is happening, it's called continuous deployment. I hope that answers the question.
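Here is a minimal, hypothetical sketch of where that boundary sits (the image name, registry, test command, and production host are made up, and this is not the exact pipeline we will build). Everything up to pushing the tested image is continuous delivery; the final optional step is what turns it into continuous deployment.

    # Hypothetical sketch of the delivery/deployment boundary.
    # Image name, registry, test command, and production host are placeholders.
    import subprocess
    import sys

    IMAGE = "registry.example.internal/shop-app:build-42"
    AUTO_DEPLOY = "--deploy" in sys.argv   # deployment happens only if enabled

    def sh(*cmd):
        subprocess.run(cmd, check=True)

    # Continuous integration: build the image and run the tests against it.
    sh("docker", "build", "-t", IMAGE, ".")
    sh("docker", "run", "--rm", IMAGE, "mvn", "test")

    # Continuous delivery: the tested artifact is versioned and stored in the
    # registry, ready to be deployed at any moment.
    sh("docker", "push", IMAGE)

    # Continuous deployment: the same artifact goes out to production
    # automatically. Without this step you have delivery, not deployment.
    # (A real deploy would also stop the old version, run health checks, etc.)
    if AUTO_DEPLOY:
        sh("ssh", "prod.example.internal", f"docker pull {IMAGE}")
        sh("ssh", "prod.example.internal",
           f"docker run -d --name shop-app -p 80:8080 {IMAGE}")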
Question from Temish: you mentioned Agile and DevOps have to go hand-in-hand. Why is that? I told you that Agile comes with its own problems. You need to keep your software always in a deployable state. Agile helps you deliver your product in small chunks, and every chunk can then be validated by your clients. For your client to validate it, you need to deploy it. So there are problems of its own: you need to continuously test, continuously release, continuously deploy, and continuously put it out to the client, probably just to a small chunk of people, not all of them. You need to do this repetitively. That's the reason why you need DevOps; DevOps helps you fix the things that Agile creates as its by-product. That is the reason why we say DevOps is complemented by Agile. Question from Mahish: where can I get the Ubuntu image and installation? Google it out; you can
type in "download Ubuntu". You can download Ubuntu Server, and on the official Ubuntu web page you should be able to get the download link for an ISO file. If it is not an ISO file, if it is a torrent, then you will have to download it through a torrent client; there is usually an option for the ISO as well, you'll have to check. Once you have it downloaded, Google something like "set up VM with ISO VirtualBox". Type that in and you will get complete steps on how to load the ISO, how to create the VM, and all that stuff. Question from Kumar: do I need to submit screenshots as part of
Module 1 assignment? Yes, please do that. Question from Eric: how do applications communicate, at the application layer or over UDP/TCP, and do I need to learn this with reference to the Linux OS? You don't need to learn anything new right now. At our level we will be communicating over TCP, but when we talk about monitoring I'll also help you understand why UDP would be preferred in a couple of case studies or scenarios. When we monitor our infrastructure, that can be done at two different levels with reference to the OSI stack, at the application level as well as at the network level, so there are two different layers that we need to work on.
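As a small preview of why UDP comes up there, here is a hedged sketch (host, port, and metric names are invented, and the line format just mimics a StatsD-style convention) of shipping a metric as a fire-and-forget UDP packet, so the application never blocks or breaks just because the monitoring server is slow or down:

    # Illustrative sketch: send metrics over UDP, fire-and-forget.
    # Host, port, and metric names are hypothetical; the "name:value|type"
    # line mimics a StatsD-style convention.
    import socket

    MONITORING_ADDR = ("metrics.example.internal", 8125)

    def send_metric(name, value, metric_type="c"):
        payload = f"{name}:{value}|{metric_type}".encode()
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            # No handshake and no acknowledgement: if the monitoring server
            # is down, the packet is simply lost and the app carries on.
            sock.sendto(payload, MONITORING_ADDR)
        finally:
            sock.close()

    send_metric("shop.request.count", 1)        # a counter
    send_metric("shop.request.ms", 42, "ms")    # a timing value

Compare that with TCP, where the sender waits for a connection and acknowledgements: exactly what you want for application traffic, but unnecessary overhead and risk for a high volume of metrics.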
Another question is from Nikal. Yes Nikal, I've unmuted you, I see your hand raised. You have a query? - [Nikal] Yeah, hi Ashish. - [Ashish] Hi. - [Nikal] I just have a query about Ubuntu, so it's related to the same thing. I have one VPS here and my Ubuntu application is running on it. So I need to go for high availability; what is the best possible solution or use case that you can suggest here? - High availability?
- Yes. - [Ashish] What is the service? - [Nikal] The service is a
GAB service, GAB application, GAB hooking. - [Ashish] Okay, there's a one-word answer to both of these things, high availability and scalability. The answer is redundancy. You need to add redundancy to your application; that's how you build high availability, because if one instance dies you should still be able to serve your customers. That's the whole point of it. To architect that, you need to have a load balancer which points to various different web servers on the backend. I'm not sure if sessions are required or not; if sessions are required, you'll have to push them out from the web servers to some central storage so that the web servers can share the session. Now the load balancer will be forwarding traffic to these web servers.
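As a sketch of what pushing sessions to central storage can look like (Redis and the redis Python client are just one possible choice here, and the host and key names are placeholders), so that any web server behind the load balancer can pick up the same session:

    # Illustrative sketch: keep sessions in a central store instead of on each
    # web server's local disk or memory. Host and key names are placeholders.
    import json
    import redis

    # One shared store used by every web server behind the load balancer.
    store = redis.Redis(host="sessions.example.internal", port=6379)

    def save_session(session_id, data):
        # Whichever web server handles the request can write the session...
        store.setex(f"session:{session_id}", 3600, json.dumps(data))

    def load_session(session_id):
        # ...and whichever server handles the next request can read it back,
        # so it doesn't matter where the load balancer sends the user.
        raw = store.get(f"session:{session_id}")
        return json.loads(raw) if raw else None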
Now for the really hard-core high availability concerns: you have made your web servers highly available, but what about the load balancer? If the load balancer dies, the web servers won't do you any good, right. So your load balancer has to be highly available too. This can be done by setting up at least two load balancers which share the same virtual IP address. Each of the machines also has its own IP address, call them A and B. If A is your primary and A dies for some reason, then B should be able to figure that out. And as soon as it figures out that A is not working, that it is not on the network, it can automatically take over A's IP address and run the connections through itself. That's how the failover mechanism is done. Or we use an ELB.
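In practice a tool like keepalived (using VRRP) does this IP takeover for you, but the shape of the logic the standby load balancer runs is simple enough to sketch. The addresses, the interface name, and the use of the ip command below are hypothetical placeholders:

    # Illustrative sketch of the standby load balancer's job: watch the
    # primary, and claim the shared (virtual) IP if it stops answering.
    # Addresses and interface are placeholders; real setups use keepalived/VRRP.
    import subprocess
    import time

    PRIMARY_IP = "10.0.0.10"        # the primary load balancer's own address
    VIRTUAL_IP = "10.0.0.100/24"    # the shared IP that clients connect to
    INTERFACE = "eth0"

    def primary_alive():
        # A single ping as a stand-in for a proper health check.
        ping = subprocess.run(["ping", "-c", "1", "-W", "2", PRIMARY_IP],
                              capture_output=True)
        return ping.returncode == 0

    def take_over_virtual_ip():
        # Attach the shared IP to this machine so traffic flows through it.
        subprocess.run(["ip", "addr", "add", VIRTUAL_IP, "dev", INTERFACE],
                       check=False)

    misses = 0
    while True:
        misses = 0 if primary_alive() else misses + 1
        if misses >= 3:             # the primary looks gone, not just a blip
            take_over_virtual_ip()
            break
        time.sleep(5)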
- [Nikal] Actually I was just trying to find the tools with the minimum cost, the freeware tools. Can we achieve this or not? - [Ashish] The tools are free, but the machines aren't free; the machines actually come at a cost. - [Nikal] If I put up two VPSes, can I do the same setup that
you are talking about with the free tools, or on a minimum budget? - [Ashish] Yes, you can do so, but you need at least three, not two. - [Nikal] Okay, so if I go with three, can I do it with only the free tools? Which tools would I need? - [Ashish] Yeah, HAProxy
is for load balancer. - [Nikal] Okay, I'll just write it down. - [Ashish] Yeah, HAProxy. - [Nikal] Yeah, HAProxy, okay. - [Ashish] And anything else
that you're already using. HAProxy is only used for the load balancer; everything else you already know. For your web servers, since there are three machines, if you want to store the sessions I guess you are already using a database, so push the sessions to the database. That should work at your level. - [Nikal] Okay, so sync the MySQL database across all three VPSes. - [Ashish] Yes, well, the database
would not be on the VPS. Is it still on the VPS? - [Nikal] Yeah. - [Ashish] It is, okay. If your database is on the-- - [Nikal] Sorry to interrupt. I have the whole application on a single VPS. - [Ashish] Okay, so the very first thing is that you need to break it down. You need the database to be highly available too, because of the database copies only one can be the primary; the other ones are going to be read-only copies. If your primary dies, then your high availability goes out of the window; it doesn't work. So what we'll do is break it down. The database is a separate concern and has to be handled using a separate set of machines. Your web servers would be a separate set of machines, and your LB would be a separate set of machines. So we chunk it down into various different components and then think about the high availability of each particular component. For the database, high availability would be only when you keep three database instances: one primary, one backup, and another one on the DR side. Then your web servers would
be at least two in number, plus two load balancers. So a total of how many machines? - [Nikal] Three, two, two: seven machines. So seven machines I need to procure and then go for the HAProxy setup. - [Ashish] Or, even easier at
the level that you're working at, don't worry about high availability all by yourself. - [Nikal] Okay. - [Ashish] If you want to use
some cloud provider services, use those, like platform-as-a-service, or infrastructure-as-a-service from AWS or any other cloud provider. What they will provide you is a database which is totally managed, so you don't have to worry about high availability of the database. The only thing you would need to worry about is your web servers, so you could create two web servers; two machines is what you would need. Another service the cloud provider will give you is a load balancer which is totally managed, so you don't need to worry about failovers and those kinds of scenarios. You will have a totally reliable load balancer, and you would be charged only for your machines, plus the data you pass through your load balancer and the data stored in the database, so it should come in cheap. - [Nikal] But they charge on an
hourly basis, right? And I think they are costly. That's a calculation I need to do at my end. - [Ashish] Yes, you should, and it would be cheap if you use low-end machines. At the traffic that you are serving, you are the best person to figure out how big your machine needs to be. If you use the free tier resources, you would not be charged. If you used, let's say, a micro instance with a gig of RAM for your web servers, you would not be charged money for that. - [Nikal] That is free only for a year, right? - [Ashish] It is free only for a fixed number of hours every month, enough to run one virtual machine free for the whole month, so if you use two virtual machines you get half of that and you would run out of that particular allotment within half a month. Even then it's not much, it's cheap. You do the math and you will see. - [Nikal] Thank you. - [Ashish] Yep, you are welcome. I hope everyone has
got their Linux access. You must have received
your email ID and password. If not, you'll have to sign up. Just log in, it will
present you with a screen which will look like this. Move to Courses, then My Courses. For you it would be this page; for me it's slightly different. So once you are on this page, depending on how many courses
you have registered for, you will see those courses there. So right now I'm clicking on DevOps Certification Training, and there will be three sections for it: Getting Started, Pre-Recorded
Classes, and Course Content. You must have already gone
through Course Content. If not, I really suggest
you do that in detail. At least skim through the slides which are supposed to be
covered in the next session; it's very important. Then under the Getting Started section I think we have two parts here; the first one is Getting Started with DevOps. It will also provide you with a link for downloading the VM. Right now it's a little bit messed up, I'll get it fixed. Pre-Recorded Classes will have recorded sessions from
the previous classes. So if you want to prepare yourself before coming to the class,
which I would recommend, please do it because you
should be having a fair idea of what's going to be
delivered plus you can point your questions
straight towards the problem. Otherwise, people generally
don't tend to ask questions and they tend to park it a lot. Four or five days later
they would come back to me asking questions from Module 1, which is not a problem at all, but I would recommend you to keep up with the flow. If you don't, you will
miss out on a lot of stuff. And what is being delivered in the class apart from the slides
totally depends on you. The more depth you want to cover, I will be here to help you out on that. It depends on the audience as well. The more you interact,
the more you ask questions, the more I know that, okay, whatever is delivered is being assimilated and absorbed properly, and I can go into more depth. I can move to the next level there. Now in the Course Content
you will also find recordings for the current class. Right now this is the first class, so you won't have any recordings here yet, but a link will appear here in the Course Content as well which will show you your current recording. Today's recording will be uploaded in a matter of two to three hours; you can go and check it out, and you will also get a notification. So here we have pretty much everything. For people who would like to get in touch after the classes, there's a nice section here; we call it Support. At the top right there's
a button called Support. You click on it, it will
allow you to send a question. Let's say you are working on Jenkins, you are stuck somewhere, you are not able to find a solution on
the internet as well. So whenever you are
trying to find a solution, you should at least be asking your colleagues first. Even before that, you should be a good Googler: when you are joining any of the technical streams, search out for a solution yourself. If you don't get it there and you need expert advice, always come back to this section. But I would recommend you do some research yourself first, especially if it is just a stupid question, stupid as in a really simple thing like how do we install Jenkins. If you keep on asking about it, nagging about it in four or five classes, that means you are not really helping yourself. That's a bad thing. You should be helping yourself first. Select the course, select the
category of the question, there are a lot of them, and then type in a description about what's wrong, what needs to be there and attach a file with a screenshot if you want. This is something I'm stuck on. I'm not able to get past it. So you could do that. So support is a good thing. Another one that has been
recently launched here in Edureka is the forum so you could chat
with your batch mates here. Click on the forum and select Introduction to DevOps, or you could choose the community as well; the community forum and your class forum can be switched with a button
here on the right hand side. So type in your questions and get the answers for them right away. That's even better. So that is the panel that is there to answer all of the queries. If you want to push a question forward, you can specifically mention that you want a particular person to be answering it, and they will forward it accordingly. That's only because I wouldn't be available 24x7 on email, so the team is already there to help you out. If they are not qualified enough or are not able to answer your particular question, they will definitely forward it to me, or to whoever you specified. Cool, so that should help you out with all of your questions. Recordings, guides, the course curriculum, and the recordings that you will be getting after each class, the most recent one, are all there. All right, let's move back to our course. One more thing that I want you people to be prepared with for
tomorrow's class is a VM, a virtual machine. That virtual machine is
again provided by Edureka and you should be able to get it. Or you could create a VM all by yourself. If you are aware of the
software called VirtualBox, you could download that; it's free of cost. We will be using tools which are free and open source, so that you guys can download them, use them for as long as you want, and try them out without worrying about any licenses. So download it and configure it. The operating system that you want is Ubuntu 14.04. You could also download 16.04, but I would want the server edition of it. The reason is very straightforward: if you download the desktop version, it will consume a lot of resources on your laptop. Since you are using a laptop or a desktop, you have quite limited resources, right; the amount of RAM that you have and the CPU power that you have are limited. And when I say that,
I would be spinning up anywhere around 10 to 12
machines by end of the class. It means that you would need
all of those virtual machines up and running and your machine should be able to support it. For people who are low on RAM, so if they have 4 GB RAM, it's still okay. We can do some tricks to fit in those many machines in there. For people who would love to use AWS or any of the cloud services,
feel free to do that, with the products or any of their particular services, but I will always make a point of comparing what we build with the famous cloud providers. So whenever I get a chance I will compare the way we are building virtual machines here with the way it is done there. Another note is for people who are working with RedHat or CentOS: for them I will compare the commands, but I would still recommend you use the VM which is being used in the class. At the first level, at the very bottom ramp-up level where we are building stuff, I want everyone to be on the same page. Otherwise you get stuck on one problem while someone else is asking about some other problem. That's fine, but it takes a lot of time and your question would not be relevant to the other people. So I want everyone to be on the same page. For Mac guys, again,
you can definitely use VMs on your machine. For Windows and Mac users: if you are already using a different operating system, you can create a VM (that's what I'm saying), install Ubuntu inside of it, and work on Ubuntu boxes. Mac itself will not be covered, by the way. There is a question here, a concern from Sidar, that Edureka OVAs were sent to us and not Ubuntu. What I was told is that
the new one is on Ubuntu so let me get it fixed for you. It will be uploaded soon. So it's on Ubuntu, the version is Ubuntu server edition 14.04. All right guys, thank you so much for being with me for so long. I'll meet you again
tomorrow at the same time with the new module. I hope everyone comes prepared with it so that we can have
more targeted questions. Cool, have a great day ahead. I'll see you tomorrow. Thank you so much. I hope you
enjoyed listening to this video. Please be kind enough to like it and you can comment any
of your doubts and queries and we will reply to them at the earliest. Do look out for more
videos in our playlist and subscribe to our Edureka
channel to learn more. Happy learning.