Server Monitoring // Prometheus and Grafana Tutorial

Captions
Hey everyone, Christian here, and in this video I want to show you my new monitoring solution with Prometheus and Grafana that I've recently deployed on my home server. You can monitor several different systems and components with it, like showing CPU, memory, disk, or network utilization, create an excellent dashboard you can fully customize to your needs, and collect all your server or application metrics as well. It's incredible, and I'm going to show you how to set it up. But first, if you're managing some services in your home lab or production environments and you want to authenticate to them securely, then take a look at the sponsor of this video, Teleport. Teleport is an open-source access proxy to securely manage SSH sessions, web applications, databases, and even Kubernetes clusters. Every login is protected with two-factor authentication, and audit logging records user sessions and logs user actions. You can install the free community edition completely self-hosted at no cost, so don't hesitate to try it out. And suppose you want to use Teleport within your business environment; in that case, they also offer an enterprise version with 24/7 support, Active Directory integration, and many more features, so just reach out to the Teleport team.

So if you are like me and you're hosting a lot of stuff in your own home lab, or maybe you have some production servers running somewhere and you want to monitor them, you know that this is often a big challenge. When you have many different applications, services, and components to monitor, you often need to log into multiple different web UIs and switch between different systems. Having a centralized monitoring system is so much better, because you can quickly identify the health of all your systems and servers, get notified when certain thresholds are reached, track any errors, and have everything accessible in one single UI that you can fully customize. You get overall visibility of the entire environment, and that's important for every sysadmin or cloud admin, and even in your home lab, because how should you know when something is going wrong if you're not monitoring it? That's really what server monitoring is all about: knowing what's going on and identifying bottlenecks in resources.

It's also important to understand that there are two different monitoring techniques, because we differentiate between logs and metrics. Logs come in various forms. On a Linux server, for example, you will find many records primarily located in the /var/log directory; applications write their events and output to log files there. For example, when you are running a web server and someone sends a new web request to it, the application will write a new entry to a log file with the IP address, the URL, and so on. So logs track events with specific details about their nature and type, and if something suspicious happens on your system, it's always a good idea to look at the log file to reconstruct what happened at that particular time. Metrics are a bit different, and they also come in different variations. One metric type, for example, is a counter: every time someone sends a new web request, we increase a counter, and then we can calculate, for example, how many web requests we get in a specific time frame. So metrics are more about thresholds, numbers, and statistics, such as CPU and memory utilization or a counter for disk read and write operations. For example, if you see low performance on your server's hard disk, you might take a look at the metrics and see if it gets too many requests in a short time; maybe you need to upgrade your hard drive because you've reached a bottleneck there. Without metrics, how should you know that?
As I said in the beginning, you probably don't want to do all that manually for every single application or every server you're running. Having a centralized server that collects and aggregates all this information is exactly what we're going to do with Prometheus. Prometheus is an open-source monitoring system that stores all these metrics in a giant database, and you can use it to scrape different systems and collect server or application metrics. But let's take a closer look at its architecture and how the system really works. First, we need to deploy the Prometheus server, which pulls the metrics from different targets. You configure all the systems and services you want to collect metrics from inside the Prometheus configuration file, and Prometheus will then initiate the connection to all configured targets and scrape the metrics at a specific interval. These targets are not necessarily physical or virtual servers, because you might want to collect different metrics for different applications running on the same server. A target can therefore represent a physical device, a virtual server like a Linux machine, or also Docker daemons, Kubernetes clusters, and much more. Prometheus offers a simple interface to query and read this data, but it's mainly there for collecting and storing it, not visualizing it. Therefore we're also going to deploy another application called Grafana. Grafana is a web UI that queries the metrics from our Prometheus server using a query language called PromQL, and you can then log into the Grafana UI and visualize the information in a friendly dashboard. It's also interesting that Prometheus has an alerting system with its own Alertmanager, which can send you notifications in case of a specific event, but that's probably a topic for another video because I don't want to make this one too complicated. Don't worry, I will make more videos about server monitoring and also server logging in the future.

Okay, that's enough for the presentation, so let's get to work and deploy this stuff on our home lab server. You can install it directly on a Linux operating system or somewhere else, but of course I'm installing it with Docker and Portainer, which I use in most of my videos. If you haven't watched my videos about containerization with Docker and the Portainer web UI, you should definitely check them out on my YouTube channel; I will put a link to the right playlist in the description down below. In the Portainer web UI you can simply start deploying Prometheus, Grafana, and some of the other exporters we need to collect the metrics later. You could just go to the official documentation and find the Docker instructions there, but of course I already created some nice templates for you so that you don't need to do that yourself. Just go to my personal GitHub page, which you will also find in the description; there I usually share templates, dotfiles, and other resources with you and our community. Go to my repository called boilerplates, where I share templates and configuration deployments for various projects, such as Ansible playbooks, Docker Compose, Kubernetes, and Vagrant files, and those kinds of things.
In the Docker Compose section of that repository you will find a bunch of different applications and templates, including Prometheus and Grafana. You can clone the repository or copy it to your server and use Docker Compose to deploy it, or simply open the Docker Compose file, copy the content, and go to your Portainer web UI where you can deploy a new stack. Just select your Portainer server, go to Stacks, add a new stack, and call it something like "monitoring". Now we need to paste the content of the Docker Compose file in here, but I also want to show you step by step what we're actually doing.

First of all, we set the version to 3 and define two volumes to store the data, because Prometheus and Grafana both need persistent data. Grafana needs to store some configuration files and some of the dashboard configurations, and Prometheus obviously needs to store all the metrics somewhere; this is the database where all the metrics we are collecting end up. You can read about the basic data retention, because this is something I asked myself: how long does Prometheus actually retain this data? In the Prometheus documentation you can see that the default retention time is set to 15 days, so anything older than 15 days is removed automatically. Usually there is not much use in going back any further; if you're monitoring a server and you have some kind of problem, you probably don't need to go back 30 or 40 days. It's mostly important to compare the metrics with how they were an hour ago or maybe one or two days ago. There are very few reasons to archive those metrics beyond the 15-day retention, but in case you do, there is a command-line parameter to extend the retention time.

Back to our Docker Compose file. We are just defining those two volumes to store the data persistently, and then we start with the first container, called prometheus. We are using the latest Prometheus version from the official repository and set the container name to prometheus. We are also exposing port 9090, which is the default port. This is not encrypted at all, so this is something you need to take into consideration: don't expose it to the public internet, because it has no authentication and no encryption. You probably only want to do this in a very protected local network, or even just in an isolated Docker network, and not expose it anywhere else. You might also consider protecting it with access proxies like Teleport, or with reverse proxies like the Nginx Proxy Manager or Traefik; this is also something I cover in other videos on my channel, so if you want to know how to securely expose any web application, take a look at the stuff on my YouTube channel. Then we need two volumes for Prometheus, because we also need a configuration file where we define which targets Prometheus should connect to and collect metrics from. These targets are configured in the prometheus.yml file, which we simply put in the /etc/prometheus directory on our host operating system and mount into the config directory inside the container. We also use the prometheus-data volume to store the data persistently; as I said, this is the database where all the metrics are stored and collected. Then we also create a second container for the web UI, Grafana, where we visualize this data. We use the latest Grafana image, set the container name, and expose port 3000. This part is also unencrypted, but it has a login mechanism; you may still want to protect it with SSL certificates behind a reverse proxy. And we store all the configuration and the dashboards inside the grafana-data volume.
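Putting that together, a minimal sketch of such a stack could look like the following. The image tags, volume names, and mount paths here are illustrative assumptions and may differ slightly from the boilerplates template:

```yaml
version: "3"

volumes:
  prometheus-data:
  grafana-data:

services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    ports:
      - "9090:9090"   # unencrypted and unauthenticated, keep it off the public internet
    volumes:
      - /etc/prometheus:/etc/prometheus   # holds prometheus.yml
      - prometheus-data:/prometheus       # the TSDB where all metrics are stored
    # optional: extend the default 15-day retention
    # command:
    #   - "--config.file=/etc/prometheus/prometheus.yml"
    #   - "--storage.tsdb.path=/prometheus"
    #   - "--storage.tsdb.retention.time=30d"
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - "3000:3000"   # plain HTTP, but protected by Grafana's login
    volumes:
      - grafana-data:/var/lib/grafana     # dashboards and settings
    restart: unless-stopped
```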
Okay, so this is fine; we can simply deploy this stack now. Let's try that, but it probably won't work yet, because we haven't created the configuration file for Prometheus. I just want to show you what happens if you deploy it without that file: if we go into the logs of Prometheus, you can see that it failed to open prometheus.yml. So first of all we need to create this configuration file so that Prometheus can operate. In the getting started guide of Prometheus you will see a template, or you can simply go to my GitHub repository and back to the boilerplates; I already created a template prometheus.yml configuration file that also includes some exporters I'm using in my test setup. Let's go to the terminal of our home server and first create a directory inside the /etc folder called prometheus, and of course I also need to do that with root privileges, I always forget this stuff. Then I create a new configuration file inside this folder called prometheus.yml. Let's open it in vim and basically just paste the content of my template in here and leave it like this; I will walk you quickly through it, it's not too complicated. First, you can define some global settings like the scrape interval: every 15 seconds is the default time at which Prometheus tries to establish a connection to all the targets we configure in the scrape configs. We add one job first and configure the target localhost on port 9090, which is the Prometheus interface itself, so it can scrape its own metrics; we use this as an example to show you at least something.
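As a sketch, such a minimal starting point could look like this (the boilerplates template additionally contains commented-out jobs for the exporters we add later):

```yaml
# /etc/prometheus/prometheus.yml
global:
  scrape_interval: 15s          # how often Prometheus contacts every configured target

scrape_configs:
  - job_name: "prometheus"      # Prometheus scraping its own metrics endpoint
    static_configs:
      - targets: ["localhost:9090"]
```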
Let's write and quit this file, and we also need to go back to Portainer and restart the container, so go to Containers and restart prometheus. Now inside the logs you should see that everything is fine, and we should be able to open the web UI: just enter the IP and the port and you should reach the Prometheus interface. You can see there's no login and no encryption anywhere, so make sure you protect this securely. First of all, we can get some information about the server health status; the TSDB status is the database, and you can see that it's already pulling some metrics. We can also see all the configured targets and whether they are up and healthy. If you configure multiple jobs inside the configuration file (we will do that later), we will also see them under Targets and see whether Prometheus can connect to them. I also want to show you how to query some metrics inside Prometheus, because on the main page we can enter an expression if you're familiar with the PromQL language, which I am not, to be honest; I'm just a beginner with this stuff. But if you click on that icon, you can see all the different metrics Prometheus has stored inside the database, and there's already a lot of stuff in here, which looks really intimidating, because if you don't know what all these metrics are, you'll never find what you're looking for. Don't worry about that; I will show you how to visualize all this data with Grafana, because visualization isn't really what Prometheus is meant for. It just collects and stores the metrics.

Okay, so these are some basic example metrics, but now you probably want to scrape metrics from a Linux server, or maybe from a Docker stack or something like that. This is how it works in Prometheus: Prometheus has a bunch of different exporters and integrations, and if you go to the official homepage under "Exporters and integrations", you can see a lot of third-party exporters mentioned there. There are so many different exporters, for databases for example; if you want to collect metrics for your MySQL database, you can just use the MySQL server exporter. There is also some hardware stuff; you can see that for FortiGate firewalls there is actually an exporter, which is pretty cool. You can scroll through everything you can monitor with this system. It's very versatile, very flexible, and you could even write your own exporter if you have a custom system you've developed and want to monitor. I want to show you the two main exporters I'm using in my home lab to scrape some interesting metrics: for some of my virtual servers running a Linux operating system, and of course for my Docker stack.

The first exporter I want to show you is the node exporter, the node system metrics exporter, one of the official exporters that can scrape metrics from servers. They don't recommend deploying it as a Docker container, but it's actually the first deployment option they show, so I don't know why they don't recommend it; it will definitely work. The main problem is that these kinds of exporters, which need to collect resource metrics like CPU and memory, always require privileges to access specific files inside the Linux virtual file system, and that requires you to mount the root directory inside the Docker container, which some people are probably afraid of. But you can see that you can set this mount to read-only, so that the container itself can't change any information; it's just about reading, because otherwise it simply couldn't collect the metrics from the host OS. So this is something we need to do. You can use their Docker Compose example, or you will again find this in my boilerplates: there is a folder for exporters where I add templates, and I will probably expand it as I monitor more systems in my home lab. For the node exporter you can see a Docker Compose template there, and we can simply copy the content and modify our stack. You could also create a separate stack for it, but I'm putting this into the same stack for the simple reason that we then have one isolated Docker network and I don't need to expose the exporter's ports anywhere, which could be dangerous. The simplest way is to deploy it in the same Docker stack, so just paste it in here; usually you don't need to change anything.
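For reference, such a node exporter service could look roughly like the sketch below, modeled on the upstream Docker example; the exact boilerplates template may differ in details:

```yaml
# added under the services: section of the monitoring stack
  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    pid: host                          # lets the exporter see host processes
    command:
      - "--path.rootfs=/host"          # where the host filesystem is mounted inside the container
    volumes:
      - /:/host:ro,rslave              # host root filesystem, mounted read-only
    restart: unless-stopped
    # no published ports: Prometheus scrapes it on 9100 over the stack's internal network
```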
I also want to scrape the metrics of my Docker containers. This is something the node exporter doesn't do, so we need a different exporter for that. Let's go back to the Prometheus exporters page and search for Docker. You will find a Docker daemon exporter there, but I'm using a different project called cAdvisor. cAdvisor is maintained by Google, and it gives container users an understanding of the resource usage and performance characteristics of their containers, so it exposes things like process, memory, and CPU usage. If you want to see metrics for your individual containers, this is a really nice exporter to use. You can find a Docker command to quickly run it, or again go to my boilerplates and use the Docker Compose template for it. You want to copy this as well; I've modified the basic version a little bit, so you might just check it out and see if it works for you, and add it to your stack configuration file too. Now we update the stack and deploy everything, and you can see that it has started two additional containers to scrape the metrics of our Docker containers and of the physical or virtual server, the host operating system.

So now we have deployed those two tools that will scrape and collect the metrics, but we also need to get these metrics into Prometheus, and for that we need to edit the configuration file and add some jobs. The first job is an example job for the node exporter; I just uncomment these lines and configure them a little bit. You can see I'm also using the DNS names here, and this is only possible if you have deployed all the containers in the same Docker Compose stack, because only then are they located in the same local, isolated Docker network and can find each other by DNS name; otherwise you would probably need to put in the IP addresses. The second job is for cAdvisor, and you can see that it's not complicated at all; you basically repeat the same pattern. You don't need to configure anything else, because Prometheus has a standardized interface for pulling those metrics, and all exporters are written to expose their metrics to Prometheus in the same standardized way; Prometheus doesn't really care what type of application it is. Let's write and quit this file and restart our Prometheus container, then go back to the Prometheus web interface. It will reload the page, so that may take a moment, but if we go to the targets, after a reload you can see everything is already up; sometimes you just need to wait a few seconds after redeploying the container. Now if we go to the main page of Prometheus and browse through the metrics, you can see a bunch of new stuff has been added: for example, the container_ metrics are from cAdvisor, and we also have the node_ metrics from the node exporter.
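The extended scrape configuration could look roughly like this sketch; the job names and targets assume the service names node-exporter and cadvisor from the compose fragments above, so adjust them if your stack names the services differently:

```yaml
# /etc/prometheus/prometheus.yml with the two additional jobs
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "node-exporter"
    static_configs:
      - targets: ["node-exporter:9100"]   # DNS name of the service in the same stack

  - job_name: "cadvisor"
    static_configs:
      - targets: ["cadvisor:8080"]        # cAdvisor listens on 8080 by default
```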
Let's try to visualize this data in a useful way so we can understand what is happening, and we use Grafana for that. We already deployed Grafana in Portainer, so we can simply open the web interface on port 3000, and this is the Grafana login. As you can see, it's not encrypted, but it has a login, so you probably want to expose it properly or add access control; I already talked about this. Let's use the default username and password, which is admin/admin. Now we just follow the instructions in the web UI and first of all add our data source. Here we can select which data we want to visualize, and you can see that Grafana can be used with a variety of different monitoring and storage systems, not just Prometheus but also InfluxDB, Elasticsearch, MySQL, and a lot of other things that integrate with it. It's very flexible and versatile, just like Prometheus, which is really cool. Of course we select Prometheus here, and now we need to tell Grafana where to find the Prometheus server. We can use the IP address or simply refer to it by name, because that should also work: enter http://prometheus or your IP address on port 9090 and test if we can reach it. You can see the data source is now working, everything is up and running, and Grafana is able to visualize data from the server.
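As a side note not shown in the video: if you prefer to define the data source as code rather than clicking through the UI, Grafana can also pick it up from a provisioning file, for example mounted into the Grafana container at the path assumed below:

```yaml
# optional alternative to the UI, e.g. mounted to
# /etc/grafana/provisioning/datasources/prometheus.yml inside the grafana container
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                 # Grafana queries Prometheus server-side
    url: http://prometheus:9090   # service name of the prometheus container in the stack
    isDefault: true
```

Either way works; the UI approach shown in the video is the quickest for a single data source.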
So now you can create your first dashboard and get started with Grafana. You can see it doesn't present you with anything by default; as I said, it's very flexible, highly customizable, and it's meant to be built out by you. You can start with an empty panel, begin querying some data, and build your own dashboard. In the right panel you can select which type of visualization you want: a time series, a bar chart, a simple stat or table, and so on. There's a lot of stuff in here, and you can customize a whole dashboard with it. Let's try a time series, and in the metrics browser we can select which metric we want to show; you can see it's all the stuff we already saw in Prometheus. For example, we probably want to visualize something like memory, so let's use container_memory_rss as the query. You can see it now starts to visualize something, but it doesn't really make much sense yet; as I said, the query language is something you would need to learn if you want to do this manually, but don't worry, I will show you a very easy way. You can also customize some of the graph styles here, which is very cool; you can do a lot with that.

Of course you don't want to build all of this yourself, because then you would probably spend a few months on it. Grafana also has a library where people share their own dashboards, because every dashboard needs to be customized for a specific exporter. When you go to the Grafana homepage and open Dashboards, you can search through the library to visualize the metrics the way you want to see them; you basically just search for an exporter, for example "node exporter", and you will find a bunch of different dashboards you can try out. One that I've used and find very useful is the Node Exporter Full. Every dashboard has a unique identifier, and this is the one I want to use, so let's click on it; it has a lot of positive reviews, which is nice. You can copy the dashboard ID, and then import a new dashboard: click on Import, paste in the identifier of the dashboard you found in the Grafana library, load it, and import it. You also need to select Prometheus as the data source. This gives you the metrics for the node exporter, the host operating system statistics: CPU busy, system load, memory used, swap used, and a bunch of other things like the CPU cores, the uptime, and some other useful information. There is much more in the drop-down menu if you want information about the network traffic, for example; there are many different metrics, and you can monitor or troubleshoot errors you see on your network interface, like the network traffic errors that would rise if you have network issues. Or you can see metrics about storage, the input/output operations, and those kinds of things, so it's really nice. And of course there's also a nice dashboard for cAdvisor to see container metrics, so let's go back to the Grafana dashboard library and search for cAdvisor. I think this is the one I tried out, which worked pretty well, so let's copy this ID, and you can see this is a basic dashboard for the cAdvisor exporter where you can see all the container statistics such as CPU and memory usage. This is a really cool thing, and you can customize all of it as well: go in, edit it, and modify the queries to fully adapt the dashboard to your needs.

Okay, I hope this was interesting and you are now monitoring your servers with Prometheus and Grafana. Of course we have just scratched the surface here; there's much more to learn about server monitoring, like centralized log management, the Prometheus Alertmanager, or other data sources in Grafana, so there are many topics I could do separate videos about. If you want to watch some more interesting stuff, I've put a suggestion for a fantastic video you should watch next in the description below. Thanks everybody for watching, and I'll see you in the next one. Bye!
Info
Channel: Christian Lempa
Views: 294,313
Keywords: prometheus and grafana, prometheus and grafana installation, prometheus and grafana monitoring, prometheus and grafana setup, prometheus and grafana tutorial, grafana prometheus, grafana prometheus dashboard tutorial, prometheus architecture explained, prometheus monitoring, prometheus monitoring demo, prometheus monitoring tool, prometheus monitoring tutorial, prometheus monitoring with grafana, prometheus setup, prometheus tutorial, what is prometheus monitoring
Id: 9TJx7QTrTyo
Length: 24min 35sec (1475 seconds)
Published: Tue Sep 21 2021