Kubernetes Monitoring Made Easy with Prometheus | KodeKloud

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
Hey everyone, this is Sanjeev from KodeKloud, and today we're going to take a look at an open source monitoring tool called Prometheus. For this video we're going to focus on deploying a Prometheus instance onto a Kubernetes cluster and setting it up to monitor both the cluster itself and any applications deployed onto it. I wanted to make this video because if you've ever deployed Prometheus onto a bare metal server or a VM, the process is a little different on Kubernetes: we're going to make use of the Prometheus Operator, and when you define configurations for the operator you don't touch the Prometheus configuration directly. The operator ships its own custom resource definitions, so we define all of the configs using ordinary Kubernetes objects. We'll learn the different custom resources, see what ServiceMonitors are, see how to define endpoints to scrape using these resources, and configure Alertmanager rules as well as recording rules, all through the custom resources. I'll also walk you through every component that gets deployed when you set up the Prometheus Operator, because there are a few moving pieces and I want to make sure you really understand what gets deployed and what each piece is supposed to do. By the end you should have a good understanding of how to get Prometheus collecting metrics for all of your applications as well as your Kubernetes cluster.

In this section we're going to talk about using Prometheus to monitor an application running in a Kubernetes-based environment, and we'll cover two things: not only how to monitor the application running on the cluster, but also how to monitor the cluster itself, because there are a lot of metrics available about the underlying infrastructure of your Kubernetes cluster. Deploying Prometheus on a bare metal server is out of scope for this video; we have a comprehensive course on the basics of Prometheus on KodeKloud, so check that out. In this video we'll focus on deploying Prometheus onto a Kubernetes cluster.

As we know, Prometheus can be deployed on a server to monitor different applications and services, and we could easily point that same server at a Kubernetes cluster. However, instead of running the Prometheus instance on a separate server, it's actually better to install it on the Kubernetes cluster itself, and there are a couple of benefits to this. First, you want to deploy your Prometheus server as close to your targets as possible, so if your targets are applications running on a Kubernetes cluster, it's better to have Prometheus running on that cluster too. In addition, we can make use of pre-existing Kubernetes infrastructure: we don't need a separate server or VM to host the Prometheus server, we can just run it on our existing Kubernetes infrastructure.
Now, as I mentioned, there are two things we can monitor when it comes to Kubernetes. We can monitor our specific applications running on the cluster, which could be a web application, a web server, or really any application running in your environment. We can also monitor the Kubernetes cluster itself, and there are a few things we'd want to watch. First are the various control plane components: the API server, kube-scheduler, CoreDNS, and so on. We'd also want to monitor the kubelet process, which is basically the equivalent of cAdvisor and exposes container-level metrics. Then there's kube-state-metrics, which gives us cluster-level metrics: metrics about deployments, pods, and the various other Kubernetes objects. Finally, we'll want a node exporter running on all of our nodes, because in a Kubernetes environment a node is nothing more than a Linux server, and like any other Linux server we've covered in this course so far, a node exporter exposes metrics about that server instance, and we do want to monitor the nodes themselves.

By default we are unable to collect cluster-level metrics about pods, deployments, services, and things of that nature; Kubernetes simply doesn't expose them out of the box. To get access to those metrics we have to install the kube-state-metrics container into our Kubernetes environment, and that container is then responsible for advertising those metrics and making them available to our Prometheus server. So keep in mind that you will have to deploy a container to get access to that information.

Every host should run a node exporter so it can expose CPU, memory, and network related stats. There are a couple of ways of doing this: we could manually install a node exporter on every node, or we could customize the base image of our nodes so it contains the node exporter process. But since we're making use of Kubernetes, the better option is a built-in concept called a DaemonSet. A Kubernetes DaemonSet makes sure that a pod runs on every single node in the cluster, so we can set up a pod containing the node exporter process and have it automatically run on every node. That way, if we add a new node, we don't have to remember to install the node exporter; the DaemonSet will automatically schedule a node exporter pod onto it.
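To make that concrete, here's a minimal sketch of what such a DaemonSet could look like. This is not the exact manifest the Helm chart we use later generates; the name, namespace, and image tag here are illustrative assumptions:

```yaml
# Sketch of a node-exporter DaemonSet: one pod per node, exposing
# host metrics on the node exporter's default port, 9100.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter        # illustrative name
  namespace: monitoring      # assumed namespace
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      containers:
        - name: node-exporter
          image: quay.io/prometheus/node-exporter:latest
          ports:
            - name: metrics
              containerPort: 9100   # default node exporter port
```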
With Kubernetes we're also going to make use of service discovery: we access the Kubernetes API to discover the list of all the targets we need to scrape. That covers all of the Kubernetes components, all of the node exporters we've installed, and kube-state-metrics; these will all be discovered automatically, so you don't have to manually enter the endpoint addresses for each one.

To deploy Prometheus we could manually create all of the necessary Kubernetes objects — the various deployments, services, ConfigMaps, and secrets a Prometheus instance needs to work on a Kubernetes cluster — but that's fairly complex, requires a lot of configuration right out of the gate, and isn't exactly the easiest solution. Instead, we can deploy Prometheus using a Helm chart that installs the Prometheus Operator, and let Helm deploy all of the different components of a Prometheus solution onto our cluster.

After each section in this video you'll have hands-on lab activities that you can perform on KodeKloud. Use the link given in the cards above or the description below to access the labs; you don't have to make any payments, they're absolutely free once you enroll. Come back here and resume the video after each section, and I'll let you know when to access which labs.

So what is Helm? Helm is a package manager for Kubernetes, which really means that all of the application and Kubernetes configs necessary for a specific application can be bundled into a package and easily deployed using Helm. All of the deployments, all of the services, all of the secrets, every Kubernetes config your application needs to run gets bundled up, and then all you have to do is run helm install and it installs all of those components to get your application up and running.

I also want to quickly discuss Helm charts. A Helm chart is a collection of templates and YAML files that convert into Kubernetes manifest files. When I say Helm is a tool we use to package our application, what I mean is that the chart contains all of the configs we need to deploy onto our cluster, both the Kubernetes configs and the application-specific configs, kept in YAML files, along with templating functionality so that when we deploy the application we can customize the deployment and tweak settings through the templating engine. That's all a Helm chart is: the ability to package your application. Each application has its own chart, and the great part is that charts can be shared by uploading them to a repository; just like you push application code to GitHub or GitLab so anybody else can pull it, you can publish Helm charts to a Helm repository, also called a chart repository.

The specific chart we're going to use is called kube-prometheus-stack, and it comes from the prometheus-community repository. This chart is great because it deploys everything we need to get Prometheus up and running on a Kubernetes cluster, and it also automatically deploys Alertmanager and Grafana for us, so it really sets up everything with a simple click of a button. It's also going to deploy the Prometheus Operator, and since the kube-prometheus-stack chart makes use of the Prometheus Operator, I want to quickly discuss what an operator is in Kubernetes.
A Kubernetes operator is an application-specific controller that extends the Kubernetes API to create, configure, and manage instances of complex applications like Prometheus. It manages the entire lifecycle of your application, handling initialization, configuration, and customization; in this case it will help install Prometheus and then manage it while it's up and running in our environment. It does a lot of convenient things for us, like automatically restarting Prometheus whenever we make config changes, and it provides plenty of other helpful abilities for managing our Prometheus instance. If you want to read up on this operator, you can follow the link below. The Helm chart makes use of this specific operator, but keep in mind you can also run it independently, outside of a Helm chart, if you just want to play around with it and get a little more familiar.

With the Prometheus Operator we get access to several custom resources that aid in deploying a Prometheus instance. When we go to customize our Prometheus server, we get a high-level abstraction over the standard Kubernetes API: instead of creating a Deployment or a StatefulSet or any of the basic Kubernetes objects, the operator comes with a built-in Prometheus resource, which we create like any other Kubernetes object and customize through that resource's API. It comes with several other custom resources that all act as abstractions over the standard Kubernetes API for managing the other aspects of a Prometheus installation: one for Alertmanager (for configuring and deploying an Alertmanager instance), one for Prometheus rules, one for the Alertmanager config, and then ServiceMonitor and PodMonitor, which are the resources we use to tell Prometheus about other endpoints we want to scrape. You'll see that this operator makes life a lot easier when working with Prometheus, because these high-level abstractions make it much simpler to modify and manage your Prometheus instance.
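For a feel of what that abstraction looks like, here's a hypothetical, minimal Prometheus custom resource. The field values (replica count, service account name, selector label) are assumptions for illustration, not the chart's defaults:

```yaml
# Sketch of a Prometheus custom resource; the operator turns this into
# a StatefulSet, so we never write the StatefulSet ourselves.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  replicas: 1
  serviceAccountName: prometheus   # assumes this service account exists
  serviceMonitorSelector:
    matchLabels:
      release: prometheus          # pick up ServiceMonitors carrying this label
  resources:
    requests:
      memory: 400Mi
```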
Next, let's install Helm and get our chart deployed so we can set up Prometheus. The first stop is the documentation page for Helm, which gives you all the instructions you need to get Helm installed and goes over the steps for the different operating systems. For a standard Linux machine they have an installer script: you run a curl command to download the script, change its permissions to make it executable, run it, and it installs Helm automatically. Once Helm is installed, you first add the prometheus-community repo, then update your repos, and then install the chart by running helm install with a release name followed by the chart name, which is the repo name, a slash, and then kube-prometheus-stack. The release name can technically be anything you want; I just called it prometheus.

If you go to the Installing Helm section of the Helm documentation, it walks you through the different methods for installing Helm onto your machine. Down below you can see how to install it through a package manager: Homebrew if you're running Mac, Chocolatey for Windows, apt for Ubuntu, and a few other options as well. I'm just going to use their default installer script, which automatically detects your operating system; especially on a Linux machine it's fairly straightforward, you just copy the script and it does all the work for you, so that's the method I prefer and recommend, but you can use any method documented on that page. It's just a matter of copying those commands and pasting them into the terminal: the first command downloads the script, and if I do an ls we can see get_helm.sh; then chmod 700 get_helm.sh makes it executable, and ./get_helm.sh runs it. After it's installed, we can verify with helm version, which tells you what version you have; if it returns output like this, the install was successful.

So we've gone over all the instructions necessary to add the repo and install the chart on our cluster, but if you want to see these instructions laid out, you can also look at the documentation for this specific Helm chart; I've provided a link in the slides. Down there you can see the steps to get the Helm repository added and the command to install the chart. Let's run those: first helm repo add (you can see I already have this repo added, so it just says it's skipping; you'll see a different output, but that's only because I've run this command before), then helm repo update, and once that's complete we can install the chart with helm install, a release name (again, anything you want; I call it prometheus), and then prometheus-community/kube-prometheus-stack, which will install this specific chart onto our Kubernetes cluster.
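Putting those steps together, the whole sequence looks roughly like this (the curl URL is the installer script location from the Helm docs; `prometheus` is just the release name we picked):

```bash
# Install Helm via the official installer script.
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
helm version   # verify the install

# Add the prometheus-community repo and install the chart.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack
```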
Before we actually run that install: when it comes to deploying this chart, all charts allow you to customize them, and there are different values you can set that tweak how Prometheus is deployed. To see the options you have, run helm show values followed by the name of this specific chart, and it will show the different values you can customize. If you run it, it prints a lot of output, so what I recommend is running the command and piping the output to a file, which we can just call values.yaml. That gives us a file containing all of the customization we can do; you can see, for example, which namespace to add the Prometheus chart to, and there are pages and pages worth of configuration. It's documented fairly well, so you can go through it, read what each configuration option does, and tweak it however you like. If you do deploy with customized values — say you went in and changed one of the values in this file — then the command changes: instead of plain helm install with the release name and the chart name, you make sure to pass the -f flag followed by the name of the file containing your customizations, values.yaml in this case. That's the only change you have to make.
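As a quick sketch, the two commands look like this:

```bash
# Dump the chart's configurable values to a file, edit as needed,
# then pass the edited file back in with -f at install time.
helm show values prometheus-community/kube-prometheus-stack > values.yaml
helm install prometheus prometheus-community/kube-prometheus-stack -f values.yaml
```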
We're going to use the defaults, though — I think the defaults are fairly fine, and later in this section I'll show you how to tweak them to modify the configuration — so I'm just going to do helm install prometheus followed by the name of the chart and let that run. After a little bit it completes, so let's take a look at what the chart actually created. I'll run kubectl get all to show all of our Kubernetes resources, and there are a couple of things I want to point out, starting at the bottom with the StatefulSets. There are two StatefulSets created for us. The one with the very long name (it gives these long names) is our Prometheus server: this is our Prometheus instance, the container actually running the process, and when we connect to our Prometheus server, this is the container we're connecting to. Above that is another StatefulSet for Alertmanager, which is the Alertmanager instance.

Next, the Deployments, of which there are a few. We've got prometheus-grafana; we haven't really covered Grafana in this course, but it's a graphical UI tool you can use to visualize the data in your Prometheus time series database, and this Helm chart installs it automatically so you can have it up and running and connected to your Prometheus server very easily. We won't focus much on it since this course is about Prometheus, but know that the chart does configure it for you. Next is the kube-prometheus operator: this is the operator I said manages the lifecycle of our Prometheus instance, handling things like updating the configs and restarting the process any time the configs change, among other things. You don't really have to worry about what it does internally; just know this is the container responsible for that management. Then there's kube-state-metrics: as I mentioned, for Kubernetes object metrics — metrics about deployments, services, pods, and so on — you need a container running to gather that data and expose it, and that's what this specific deployment is for. The ReplicaSets are just subsets of the deployments: one for the Grafana server, one for the Prometheus operator, and one for kube-state-metrics.

Above the Deployments section we have one DaemonSet, called node-exporter. This DaemonSet is responsible for deploying a node exporter pod on every single node in our cluster — that's exactly what a DaemonSet does, it makes sure every node, even one added to the cluster later, gets this specific pod. The pod collects host metrics like CPU utilization, memory utilization, and information about the file system, and exposes an endpoint so our Prometheus server can track metrics about the node itself. We can see two are ready and up, and if I run kubectl get nodes we see we have two nodes, which is why.

There are two more things to quickly go over before we wrap up this high-level overview. The Pods section lists all the pods that have been deployed: our Prometheus server pod, our Alertmanager up top, then Grafana, our Prometheus operator, our kube-state-metrics pod, and two node exporters, one running on each cluster node. The Services section has the respective services for all of those pods, and one thing to point out: all of these are ClusterIP, so right now nothing is exposed to the public. When we want to access the Prometheus server or the Grafana instance we'll have to set up an Ingress, modify one of these services to be a LoadBalancer, or just set up a proxy so we can quickly play around with it. Just something to keep in mind.

Now I'll quickly run kubectl get statefulset, because I want to describe the Prometheus server StatefulSet to look at the configuration the Helm chart provided for it. I'll run kubectl describe statefulset with that name pasted in; there's a lot of output and it can be a little hard to read, so I'll run the same command piped to a file — I'll call it prometheus.yaml — and let VS Code do the syntax highlighting so it's easier to read. There's a lot of base configuration, but I want to look at the containers. There's one init container, an init-config-reloader, which looks responsible for creating the initial configuration for Prometheus; you can see it uses a special prometheus-config-reloader image, and you can look through its arguments, but what I'm more interested in is the Prometheus container itself.
Here under the containers section we can see the Prometheus container, the specific image it uses, and its arguments, which are just the standard ones: we're passing in a config file, the path for the time series database, and the paths for the console and console libraries. Scrolling down a little further, there's another container running called config-reloader. It keeps track of any changes made to the configs and is responsible for reloading the Prometheus instance so it can update its configuration.

If we look at the mounts, we can see where all of the configs actually live within Kubernetes. Under mounts there's /etc/prometheus/config — this will hold that prometheus.yaml file — and it comes from a volume named config; looking at the volumes, config comes from a Secret, whose name is listed there. There's also /etc/prometheus/rules, which holds the rule files, the list of rules Prometheus will have; that comes from a ConfigMap, and if you look at the name you can see exactly which one. We can describe those Secrets and ConfigMaps to get a little more information on how everything is set up. I'll grab that secret and run kubectl describe secret, and we can see it contains a prometheus.yaml.gz file — this is what holds our Prometheus configuration. I'm sure you're wondering how and where we modify this; you'll see that the Prometheus Operator sets up a very nice way to modify the Prometheus config, so we don't actually have to touch this YAML file at all. We use regular Kubernetes manifests to generate Prometheus configurations, and when we get to that section you'll see just how easy it is.

To quickly look at the rules, I'll grab the name of the ConfigMap for the rule files and run kubectl describe configmap with that name pasted in. This one stores the rule file itself, so I'll pipe it to a file I'll call rulefile.yaml and take a look: we can see the file name and then what looks like a completely standard rule file, nothing else to it in this case. You'll see that for modifying rule files too, the Prometheus Operator gives us high-level abstractions so we don't have to go in and manually update this ourselves.
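If you do want to peek at the generated config, one common trick is to pull the gzipped secret and decode it. A sketch, assuming the secret name that a release called `prometheus` typically produces — check `kubectl get secrets` for the actual name in your cluster:

```bash
# Extract and decompress the generated Prometheus config from the secret.
# The secret name below is an assumption based on the default naming.
kubectl get secret prometheus-prometheus-kube-prometheus-prometheus \
  -o jsonpath='{.data.prometheus\.yaml\.gz}' | base64 -d | gunzip
```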
I also want to take a quick look at the operator's configs, so I'll run kubectl get deployment, then kubectl describe deployment on the operator, pipe that to a file as well, and open it up. It's just a standard deployment, and under containers we can see a container called kube-prometheus-stack using the Prometheus operator image, with a couple of arguments that were provided automatically to get it up and running and configured properly. Outside of that there are no other configs specifically needed for the operator; there's only one mount, which is just the TLS certificate, and nothing else. So the operator is pretty straightforward: you don't really need to know much about how it truly works, you just have to understand that it's the one responsible for managing your Prometheus server and its configurations.

So how exactly do we connect to the Prometheus server and access the UI? There are a couple of ways. If we run kubectl get service, we can see the Prometheus service running right there; this provides the connectivity to the actual Prometheus instance. Right now it's a ClusterIP type service, which means we do not have access to it from outside the cluster, only from within. If I run kubectl get on that specific service and pipe it to a file so we can look at it, we can see from the type section that it's set to ClusterIP. You could go ahead and update this to one of the other service types, like a NodePort or a LoadBalancer, so you can access the Prometheus server from outside the cluster, or you could set up an Ingress and route the traffic for a specific URL or domain to this service, which would also give you outside connectivity. Choose whichever method you prefer, but for the sake of this demo I'm just going to do port forwarding so we can verify that everything is in fact running properly. That's easy to do: I'll run kubectl get pod, and then kubectl port-forward with the specific pod and the port we want to connect to; looking at the Prometheus service, it listens on port 9090, so we just pass in port 9090.
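Assuming the default pod name this release produces (yours may differ; copy it from `kubectl get pod`), the command looks like:

```bash
# Forward local port 9090 to port 9090 on the Prometheus pod.
kubectl port-forward prometheus-prometheus-kube-prometheus-prometheus-0 9090
```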
With that running, traffic destined to localhost on port 9090 gets routed to the Prometheus server, so if we go to that address in our browser, we can see we do in fact have access to our Prometheus server. But like I said, this isn't meant as a long-term solution: if you need connectivity from outside the cluster, you'll have to either update the service to a NodePort or LoadBalancer, or set up an Ingress to route traffic to the Prometheus service.

Now that we can connect to our Prometheus server, I want to look at the configs that were installed by default, but before we do that I want to quickly go over the Kubernetes service discovery functionality, because Kubernetes does in fact have service discovery too, so we're able to dynamically learn the different targets we want to scrape. There are a couple of different roles available. Looking at the Prometheus configuration docs: the node role discovers all nodes within a Kubernetes cluster, along with all the meta labels that come with them; the service role discovers all services and their respective ports; the pod role discovers all pods; and at the end there's the endpoints role, which the docs say discovers targets from the listed endpoints of a service. This is probably the most flexible of the service discovery roles: we can discover pods, services, nodes, and everything else using endpoints, because everything ends up with an endpoint or a service assigned to it. Really, just think of an endpoint as an IP address and a port tied to some resource, and it doesn't matter what that resource is — a pod, a service, a node, anything — so everything technically has an endpoint. With the endpoints role we can get everything in our cluster — all of the pods, all of the nodes, all of the services — and then set up filters so we only scrape targets that match certain labels. I point this out because the default configs exclusively use the endpoints role, and I don't want you to be confused by that: it gives you access to everything, so it's the most flexible one.

Okay, with the web UI up, let's go to Status and then Configuration, where we can look at the configs that were automatically generated. Let's ignore the global section — we're already familiar with that — and focus on the scrape configs, since everything else doesn't matter too much. Here we've got a job named serviceMonitor/default/prometheus-kube-prometheus-alertmanager, which is the scrape config for scraping Alertmanager, and you'll see a whole bunch of configuration under it. I actually want to start at the bottom of the job, with the Kubernetes service discovery section.
You can see it uses the endpoints role I mentioned, and that's all we have to do: this discovers all of the endpoints in our Kubernetes cluster, and we don't really have to worry about anything else there. The way we end up scraping specifically the Alertmanager target, and not accidentally scraping other things, is the relabel configs, where you can see all of the labels we're actually going to keep and scrape. Look at this line specifically: it matches on __meta_kubernetes_service_label_app together with __meta_kubernetes_service_labelpresent_app, with a regex matching kube-prometheus-stack-alertmanager. This is what lets us figure out which endpoint belongs to our Alertmanager, and since the action is keep, we keep it and scrape that Alertmanager instance. There are several other rules as well. At the top we have a source label whose target label gets rewritten to a temp label — I'm not exactly sure why they do that specifically, but I wanted to go over the other configs too. Further down there's __meta_kubernetes_service_label_release together with __meta_kubernetes_service_labelpresent_release, matching prometheus and true, with a keep action. We'll discuss that label in a bit, but basically everything Prometheus should be scraping and monitoring is expected to carry this release label of prometheus, and that determines whether it gets scraped, which is why we keep it. So the job just walks through the labels on all of our various endpoints and figures out which ones represent Alertmanager, and that's what all of these rules do.

Continuing down, we've got the Kubernetes API service job, which scrapes the API server itself. It has the same exact service discovery config, and then a bunch of relabel configs so we match on just the API endpoint, collect those targets and metrics, and drop everything else. Below that there's one for CoreDNS — once again no different from the others, just a matter of matching labels so we grab the CoreDNS endpoint — and then the kube-controller-manager, kube-etcd, kube-proxy, and so on down the list of the different endpoints we want to scrape.

If we go up to Service Discovery, we can see everything that's been discovered: our Alertmanager; kubelet 0, 1, and 2; our operator; our Prometheus server; kube-state-metrics; our node exporter; the API server; and CoreDNS. And if we go to Targets, we can see all of them are in an up state and being scraped successfully. All of this was set up out of the gate for us by the Helm chart, so we didn't have to manually define all of those relabel configs to reach the specific targets we want to scrape on a per-job basis.
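Condensed down, each generated job has roughly this shape (heavily abridged; the real output on the /config page is much longer, and the exact regex values here are illustrative):

```yaml
# Rough shape of one generated scrape job from the kube-prometheus-stack.
- job_name: serviceMonitor/default/prometheus-kube-prometheus-alertmanager/0
  kubernetes_sd_configs:
    - role: endpoints        # discover every endpoint in the cluster
  relabel_configs:
    # Keep only endpoints whose Service carries the expected app label.
    - source_labels: [__meta_kubernetes_service_label_app, __meta_kubernetes_service_labelpresent_app]
      regex: kube-prometheus-stack-alertmanager;true
      action: keep
    # Keep only Services carrying the release: prometheus label.
    - source_labels: [__meta_kubernetes_service_label_release, __meta_kubernetes_service_labelpresent_release]
      regex: prometheus;true
      action: keep
```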
Okay, so we've successfully got our Prometheus server monitoring our Kubernetes infrastructure, with all the endpoints set up, and now I want to show you how to configure Prometheus — when it's installed via the Prometheus Operator — to monitor an application running on your Kubernetes infrastructure. I have a dummy Node.js application we'll use for demonstration purposes. It makes use of a library called Express, which lets us create a simple API where we can send requests to /comments, /threads, or /replies and get some dummy data back. A few other things to note about this application: it listens on port 3000, so when we set up our service we want to make sure it forwards traffic to port 3000, and more importantly, I've set up Prometheus metrics on it, accessible at /swagger-stats/metrics — that's the specific URL our Prometheus server should scrape to get metrics data from our application.

The plan is to containerize this application, push it up to Docker Hub, and then create all the Kubernetes manifests so we can get it running on our Kubernetes infrastructure. I already have a simple Dockerfile set up that lets us create an image from the application, so I'll run docker build now, giving it a tag made up of my Docker Hub username and the name prometheus-demo (making sure to use sudo). The image builds successfully, and then I run docker push for that same image, and our image is now up on Docker Hub.
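The two commands, with a placeholder where your own Docker Hub username goes:

```bash
# Build the demo image and push it to Docker Hub.
docker build -t <your-dockerhub-username>/prometheus-demo .
docker push <your-dockerhub-username>/prometheus-demo
```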
Now let's create the Kubernetes manifests to deploy our application onto the cluster. I'll create a new file called api-depl.yaml, which will contain our Deployment configuration as well as our Service configuration, and I'll just copy and paste the configs in so you don't have to watch me type them out; then we'll go over them line by line to make sure we're all on the same page. First the Deployment: I've given it an arbitrary name, api-deployment, with a label; under spec, replicas is set so we get two instances of our app, and the selector's matchLabels is set to app: api, which needs to match up with the label on the pod template. Down below, under the spec with the list of containers, there's one container with the arbitrary name api, the image is the name I used to push to Docker Hub, and containerPort is 3000, the port the container actually listens on. That creates the deployment; the next thing we want is the Service, to handle communication. You can use any kind of service you want — it doesn't really matter — and I'm going to use a ClusterIP, so technically it isn't accessible from the outside world, but that's okay. The service has an arbitrary name and a couple of arbitrary labels, the type is ClusterIP, and the selector — which decides which pods the service fronts — is app: api, so it has to match up with the app: api label on the pods. Under ports there's one port, named web, protocol TCP, forwarding port 3000 to the target port 3000 on the container, which once again is the port the container, or the app, listens on. Those are all the configs I need; the full manifest is sketched below.
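Here is the shape of api-depl.yaml as described; the image tag is a placeholder for whatever you pushed to Docker Hub, and the names and labels follow the walkthrough above:

```yaml
# api-depl.yaml — Deployment plus ClusterIP Service for the demo API.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-deployment
  labels:
    app: api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api               # must match the pod template labels below
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: <your-dockerhub-username>/prometheus-demo
          ports:
            - containerPort: 3000   # the port the app listens on
---
apiVersion: v1
kind: Service
metadata:
  name: api-service
  labels:
    app: api-service
    job: node-api            # the ServiceMonitor's jobLabel will read this later
spec:
  type: ClusterIP
  selector:
    app: api                 # routes traffic to the pods above
  ports:
    - name: web
      protocol: TCP
      port: 3000
      targetPort: 3000
```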
I'll deploy this with kubectl apply -f api-depl.yaml, and it looks like it created the deployment and the service. We can double check that real quick: kubectl get service shows our api service, and kubectl get deployment shows it's up, one of two ready; running it again, both replicas are up and working. So everything deployed successfully, and next we'll look at how to set up Prometheus to scrape this application.

With our application deployed onto Kubernetes, it's time to set up the Prometheus configuration so it's aware of these new targets, and there are two different ways of doing this. I don't want to say one way is wrong and one is right; rather, one way is less ideal, and the other is the more optimal way that the Prometheus Operator would like you to configure it. I'll show you the less preferred way first — we won't run through all of the steps, just a high-level overview of what to change, since it's pretty straightforward — and then we'll look at ServiceMonitors, the more ideal way of adding new targets to Prometheus.

When we first covered Helm, I mentioned the values file that lets us tweak the chart's configuration. So run helm show values again, passing the chart name, prometheus-community/kube-prometheus-stack, and pipe it into a values.yaml file; this shows all the different things we can configure and modify. In there, search for additionalScrapeConfigs; you'll see a section with that name, and the documentation in the file is actually pretty good. It says the additional scrape configs allow specifying additional Prometheus scrape configurations, which is pretty much exactly what we're looking for, but it also gives a couple of disclaimers: the scrape configs are appended as-is, the user is responsible for making sure they're valid (it won't perform any validation for you), and using this feature may expose the possibility of breaking upgrades of Prometheus. That's why I said this is the less ideal way of doing things: it will work, but it could potentially lead to some issues. All you have to do under that section is provide a list of jobs — there's an example job right there — and you'll notice the configs are exactly the same as a Prometheus configuration; it's just like configuring a job on your Prometheus server, no different, so you're basically passing in a list of new jobs. You can just uncomment the example, remove what you don't need, and provide all your jobs. After that, run helm upgrade with the release name (prometheus), then the chart name, and then the -f flag to pass in values.yaml with your configuration. The upgrade makes whatever changes are necessary relative to the previous deployment, updates the configuration, and Prometheus becomes aware of the new targets.
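As a sketch, the edited section of values.yaml could look like this — the job itself is illustrative, and in kube-prometheus-stack this key sits under prometheus.prometheusSpec:

```yaml
# Excerpt of values.yaml: append a plain Prometheus scrape job.
prometheus:
  prometheusSpec:
    additionalScrapeConfigs:
      - job_name: api                          # illustrative job
        metrics_path: /swagger-stats/metrics
        static_configs:
          - targets: ["api-service:3000"]      # our Service and port
```

Then apply it with `helm upgrade prometheus prometheus-community/kube-prometheus-stack -f values.yaml`.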
Like I said, though, that's the less ideal way, so now let's move on to adding targets to the scrape list using the ServiceMonitor method. First I want to talk about CRDs, custom resource definitions. The Prometheus Operator we have running comes with several CRDs that provide a high-level abstraction for deploying Prometheus as well as configuring it, and we can see them all by running kubectl get crd. There are several: at the top we've got one for AlertmanagerConfig and one for Alertmanager, but the two we want to focus on are Prometheus — this one is used to actually create a Prometheus instance, where you provide various details about how you want it configured and deployed — and, specifically for this part, the ServiceMonitor CRD, which is what we're going to create to add additional targets for Prometheus to scrape. A custom resource acts like any other Kubernetes object: we configure it like any other Kubernetes resource, and I'll show you how easy it is to get a new set of targets added to your scrape list using ServiceMonitors.

So what is a ServiceMonitor? A ServiceMonitor defines a set of targets for Prometheus to monitor and scrape. It lets you avoid touching the Prometheus configs directly and gives you a declarative Kubernetes syntax for defining targets: instead of modifying the Prometheus configs, you create Kubernetes resources that update the list of targets Prometheus should be scraping. It looks like creating regular Kubernetes resources; you won't really even be able to tell that you're touching the Prometheus configs.

We previously deployed our app with a Deployment and a Service, and a ServiceMonitor references a Service somewhere in our application. So if we want to scrape the API we created, we have to reference the Service we created for it. Looking at a ServiceMonitor, it's basically like any other Kubernetes resource, except under kind we set it to ServiceMonitor, the custom resource defined for our application. There are a couple of things I want to point out. First, the name is just arbitrary; you can name it whatever. Then there are a couple of labels that do matter, but we'll cover those in a bit. What we really want to focus on is the endpoints section, where we provide details like the scrape interval — here, a specific target gets scraped every 30 seconds — and the path: by default it would be /metrics, but I told you our application serves metrics on /swagger-stats/metrics, which is why you see it set like that. Now, how do we actually get our ServiceMonitor to point at the specific Service our application uses? It's very simple. First, under selector matchLabels you provide a label matching the label you gave your Service, which tells the ServiceMonitor which Service it's referencing. Then we have to tell it which specific port to scrape: in this case we set port: web, which matches the name we gave the port on the Service, so it scrapes port 3000 through that named port. Finally, by default the job label for this target is the name of the Service, so it would default to api-service; if you want to customize it, you add the jobLabel property. jobLabel tells Prometheus to look for a label on the Service — here one called job — and whatever the value of that label is becomes the name of our job, in this case node-api. If I changed jobLabel to mylabel, Prometheus would look for a label called mylabel on the Service, and its value would become the new job name. That's all you have to do to configure a ServiceMonitor: specify the selector and the port, plus any extra configuration like the interval and path, and the optional jobLabel property, and that lets Prometheus know it should scrape whichever target sits behind that specific Service.

There's one more thing we have to add. If I run kubectl get on the Prometheus CRD's resource — not the ServiceMonitor one — with -o yaml, there's one property in the configs I want to go over: serviceMonitorSelector. It provides a label, in this case release: prometheus, and this label is how Prometheus finds ServiceMonitors in the cluster and registers them, so it can start scraping whatever application the ServiceMonitor points to. By default Prometheus doesn't know which ServiceMonitors to look for, but if you give a ServiceMonitor this specific label, it knows to automatically add it to the target list. So in our ServiceMonitor's configs we need to make sure that same label appears under the labels section, so Prometheus can add it dynamically; that's the final requirement.
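Putting all of that together, here's a sketch of the ServiceMonitor for our API, with the field values taken from the walkthrough above (make sure the matchLabels agree with the labels you actually gave your Service):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: api-service-monitor
  labels:
    release: prometheus      # required so the serviceMonitorSelector finds it
spec:
  jobLabel: job              # read the Service's "job" label for the job name
  endpoints:
    - interval: 30s
      path: /swagger-stats/metrics
      port: web              # the named port on the Service
  selector:
    matchLabels:
      app: api-service       # must match the Service's labels
```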
Okay, let's take a look at our CRDs. kubectl get crd shows the list, and like I said, the two main ones we want to focus on are the Prometheus one, which is responsible for creating the Prometheus instance, and the ServiceMonitors, which are there so we can add extra targets and jobs. Run kubectl get on the prometheus resource with -o yaml and let's make sure we check what that selector label is: going up to serviceMonitorSelector, we can see matchLabels is set to release: prometheus, so when we create a ServiceMonitor we want to make sure this label is present. By the way, this label was added by the Helm chart and is completely customizable through values.yaml, so if you ever want to change it, you can modify it in the chart.

Now let's go to api-depl.yaml and add our ServiceMonitor at the bottom. I'll paste it in again; it's basically the same thing we saw in the slides. I've given it the name api-service-monitor; it carries the release: prometheus label, which we need so Prometheus can dynamically discover this ServiceMonitor; the jobLabel is set to a value of job, and if I go up to my Service, its job label is set to node-api, so that becomes the job name; endpoints has the 30-second scrape interval with our metrics path, and points at the port named web, which maps to the specific port we defined on the Service. And under the selector, make sure the matchLabels line up with the labels you actually gave your Service — I had a mismatch here at first and had to correct it so the two agree.

That's all we need, so I'll pull up the terminal again and run kubectl apply -f api-depl.yaml, and it has now successfully created a ServiceMonitor. If you want to take a look, run kubectl get servicemonitor to see all of the ServiceMonitors: there are pre-existing ones the Helm chart created — that's why all of those earlier targets are already being scraped and defined — plus the one we just created.

Now that we have that configured, let's verify that Prometheus is actually scraping it. Go to Status, then Targets, and right at the top we can see our api-service-monitor. There are two targets listed, which makes sense because the replica count in our deployment was set to two, and on top of that we can see the job got properly set to node-api. Back on the main Prometheus page, we can run a quick query with job="node-api" to verify we're actually getting metrics, and we can see that we are. So we've now configured a new set of targets in our Prometheus instance using ServiceMonitors. If you want to look at the final configs this actually generates on the Prometheus side, go to Configuration and search for api-service to see the scrape config that was created.
In those relabel configs, under __meta_kubernetes_service_label_app we can see the value is set to api with a keep action, which is where it actually matches our specific Service during service discovery, and you can also see the __meta_kubernetes_endpoint_port_name label set to web. So all of the configs we put in the ServiceMonitor line up with the relabel configs of this specific scrape job.

We've gone over adding new targets with ServiceMonitors; now let's look at adding rules. To add rules, the Prometheus Operator has a CRD called PrometheusRule, which handles registering new rules with a Prometheus instance. It looks pretty similar to a ServiceMonitor, except the kind is PrometheusRule, and under spec we configure our rules exactly as we normally would inside a prometheus.yaml file: there's a groups section, we provide the name of each group, and then we specify all of the rules. That's all it is — a standard YAML file with a standard Kubernetes resource called PrometheusRule, with your groups and rules specified like usual; nothing else is really any different. One other thing to point out: if we run kubectl get prometheus, there's a ruleSelector property in the output, which works very much like the serviceMonitorSelector — this label lets Prometheus find PrometheusRule objects in the cluster and register them dynamically. So we'll have to add that same specific label to our configs to make sure our Prometheus instance is able to find them.

Let's add some rules to our deployment. I'll create a new file, which I'll just call rules.yaml, and paste in the same configs from the slides; we'll quickly go over them. We've got one group called api, with just one rule that checks whether a node or instance is down, and most importantly, under the labels we have release: prometheus, so Prometheus should dynamically detect this and register the rule. I'll run kubectl apply -f rules.yaml — after realizing I forgot to save the file — and now it's successfully created. kubectl get prometheusrule shows all of the pre-existing rules, and if we look for our api rule, we can see it was created 52 seconds ago. Now, if we go to the Prometheus UI under Status and Rules, we can see our api group with our rule successfully registered and in an OK state. That's all you have to do to register a rule: create a new PrometheusRule object and specify your rules in it.
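As a sketch, a rules.yaml along the lines of the one in the demo — the exact alert name and expression here are illustrative; the demo's rule simply checks whether an instance is down:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: api-rules
  labels:
    release: prometheus      # required so the ruleSelector picks it up
spec:
  groups:
    - name: api
      rules:
        - alert: InstanceDown          # illustrative rule
          expr: up == 0                # fires when a target stops responding
          for: 1m
          labels:
            severity: critical
```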
Okay, so to add Alertmanager rules, the Prometheus operator has another CRD, called AlertmanagerConfig, which handles registering new rules with Alertmanager. Here's an example config; the AlertmanagerConfig is also pretty straightforward. We have the kind set to AlertmanagerConfig, and if you take a look at the configuration under spec, it's just standard Alertmanager config: you've got your route, your group-by, your group wait, and then you have your receiver and your list of receivers. It's really no different from configuring a regular Alertmanager rule.

One thing to point out: once again, if we do a `kubectl get alertmanagers -o yaml` and take a look at the config, there's a property called alertmanagerConfigSelector. Just like we saw with Prometheus when setting up a rule or a ServiceMonitor, this label allows Alertmanager to find AlertmanagerConfig objects in the cluster and register them. The important distinction here is that the Helm chart does not specify a label by default, so there's no label we can put on our AlertmanagerConfig to get it registered. We'll have to go into the chart and update this value so that we have a label we can use.

Now, the standard alertmanager.yaml configuration file that you'd have on an Alertmanager instance that isn't deployed on Kubernetes is a little different from the AlertmanagerConfig custom resource defined within Kubernetes using the Prometheus operator. I want to highlight some of the differences, because I got stuck on this for a bit and I want to make sure you guys don't waste any unnecessary time figuring out the errors. The first thing to point out: when you're defining configurations like group by, group wait, or group interval, the standard alertmanager.yaml uses snake case. In snake case, whenever you have a property with multiple words, like group wait (the two words being "group" and "wait"), you separate them with an underscore. The AlertmanagerConfig resource is a little different: instead of snake case it uses camel case, where the two words are not separated by an underscore but combined into one word, with the first letter of the second word capitalized. So group_wait becomes groupWait, and the same goes for all the other properties.

The only other difference I want to point out is how you set up a matcher. Let's say you have a job label with the value kubernetes. In the AlertmanagerConfig on Kubernetes, instead of just specifying job: kubernetes as your matcher, you have to split it up: you specify the name of the label, which is job, and then the value of the label, which is kubernetes. So instead of just putting in job: kubernetes, you say name: job and value: kubernetes.
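To make the camel case and matcher differences concrete, here's a minimal sketch of an AlertmanagerConfig. The receiver name, grouping labels, and webhook URL are illustrative assumptions rather than the exact config from the video:

```yaml
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: alert-config
  labels:
    release: prometheus       # must match the alertmanagerConfigSelector set below
spec:
  route:
    groupBy: ["severity"]     # camelCase, not group_by
    groupWait: 30s            # camelCase, not group_wait
    groupInterval: 5m
    receiver: webhook
    matchers:
      - name: job             # label name and value split out,
        value: kubernetes     # instead of a single "job: kubernetes" entry
  receivers:
    - name: webhook
      webhookConfigs:
        - url: "http://example.com/webhook"   # hypothetical webhook endpoint
```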
Now, to upgrade the Helm chart, the first thing we want to do is get all of the values for the chart: we do a `helm show values` with the chart name and pipe it to a file called values.yaml. Then we update the property called alertmanagerConfigSelector, pass in a matchLabels property, and specify our label underneath. You can use any label you want; I'll just use the same one we've been using, which is release: prometheus, but keep in mind it doesn't have to be the same one. Then we run `helm upgrade` with the release name, the chart name, and the -f flag for values.yaml, and that will update the chart so the configs get updated accordingly.

Okay, so let's take a look at the CRD. I'll do a `kubectl get crd` and just look at the Alertmanager CRD, then run `kubectl get` on it with `-o yaml` so we can look at the configs. Like we saw in the slides, the thing we want to verify is that alertmanagerConfigSelector has nothing set, so this is what we'll have to update in the chart. To update the chart, we do the `helm show values` (I'm going to copy the chart name, because it's fairly long) and pipe the output to values.yaml so we can edit it. In values.yaml we find the alertmanagerConfigSelector property, right here; I'll delete these two empty curly braces and add matchLabels with release: prometheus. Like I said, this can be any value you want, but I'm going to use the same one. Make sure to save it, and then we can do the `helm upgrade` with the release name, the chart name, and the -f flag for values.yaml.

Okay, the Helm chart has been upgraded, so let's run that same command where we take a look at the Alertmanager config, and we can see that the alertmanagerConfigSelector was successfully updated. Now we can actually create a rules file for our Alertmanager. I'll create a new file called alert.yaml and paste in that example config. From this configuration, the only thing we really need to focus on is the label: I've got the release: prometheus label, so it should automatically get picked up. In this case we're grouping all of our alerts by a label called severity, and we've got one receiver called webhook. I'll save this, and now we can just do a `kubectl apply -f alert.yaml`. Okay, we've successfully created that AlertmanagerConfig, and we can verify that by doing `kubectl get alertmanagerconfig`, where we can see the alert-config we just created.

Now, to verify that everything worked, let's get access to Alertmanager. We'll set up port forwarding again: do a `kubectl get services`, then `kubectl port-forward service/alertmanager-operated` (that's the one we want), and it's going to be listening on port 9093. Once it's listening on port 9093 we can access it, and we'll double-check the configs to verify that everything looks good. If we go to the Status page, we can take a look at the configs, and we can see my webhook alert config, webhook-example, with all of its configs there, and the receivers for it down below. So it looks like it worked, and that's all you have to do to add configs or rules to Alertmanager.
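For reference, the command sequence from this demo looks roughly like the following sketch. The release name (prometheus) and chart name (prometheus-community/kube-prometheus-stack) are assumptions, so substitute your own:

```bash
# Dump the chart's default values so we can edit the selector
helm show values prometheus-community/kube-prometheus-stack > values.yaml

# In values.yaml, replace the empty selector with:
#   alertmanagerConfigSelector:
#     matchLabels:
#       release: prometheus

# Apply the updated values to the existing release
helm upgrade prometheus prometheus-community/kube-prometheus-stack -f values.yaml

# Register the AlertmanagerConfig and confirm it exists
kubectl apply -f alert.yaml
kubectl get alertmanagerconfig

# Port-forward to Alertmanager and check its Status page at localhost:9093
kubectl port-forward service/alertmanager-operated 9093:9093
```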
All right guys, that's going to wrap things up for today. If you want to learn more about Prometheus, head on over to kodekloud.com and you can take my Prometheus course. The course starts from the absolute basics, so if anything in this video was a little confusing, it starts from the very beginning: what Prometheus is, why we need it, what the purpose of this monitoring tool is, and how it fits within the overall observability stack. Then we take a look at how to get it deployed on a VM first, starting with the absolute basic scenarios, before moving on to more complex scenarios, just like this one where we deploy it on a Kubernetes cluster. We've also got plenty of other sections covering the different features, like PromQL and the built-in dashboarding and visualization solutions that come with Prometheus. And just like with any other KodeKloud course, there's going to be plenty of hands-on practice: every section has at least one hands-on lab where you can practice everything you just learned in the lectures and really reinforce those ideas, which I think is just a better way of learning. So if you guys are interested, definitely take a look at the course. At the end of the course we also have a couple of mock exams, so if you ultimately decide to pursue the Prometheus Certified Associate exam, this will help you prepare for that as well. I'll see you guys in the next one.
Info
Channel: KodeKloud
Views: 44,550
Keywords: kubernetes, prometheus, devops, kubernetes monitoring with prometheus, prometheus monitoring kubernetes, prometheus monitoring tutorial, kubernetes monitoring with prometheus and grafana, kubernetes monitoring, prometheus monitoring tutorial for beginners, kubernetes monitoring prometheus grafana, kubernetes monitoring and alerting, kubernetes monitoring best practices, What is helm, Helm charts, Prometheus chart, Install Helm, install prometheus chart, KodeKloud
Id: 6xmWr7p5TE0
Length: 68min 55sec (4135 seconds)
Published: Wed Mar 08 2023