How to Install Prometheus and Grafana on Ubuntu? (Node Exporter & Alertmanager & Pushgateway)

Captions
Today, we're going to install Prometheus, Node Exporter, Pushgateway, and other monitoring components on Ubuntu. To visualize metrics, we will use Grafana. I'll also show you how to secure Prometheus with a username and password. Finally, we will install Alertmanager and configure it to send notifications to a Slack channel. You can find all the commands that I run in the video in the blog post. The link will be in the description.

First of all, let's create a dedicated Linux user, sometimes called a system account, for Prometheus. Having individual users for each service serves two main purposes. It is a security measure that reduces the impact in case of an incident with the service. It also simplifies administration, as it becomes easier to track down which resources belong to which service.

To create a system user or system account, run the following command.
--system - Creates a system account.
--no-create-home - We don't need a home directory for Prometheus or any other system accounts in our case.
--shell /bin/false - Prevents logging in as the Prometheus user.
prometheus - Creates the Prometheus user and a group with the exact same name.

Let's check the latest version of Prometheus on the download page. You can use the curl or wget command to download Prometheus. Then, we need to extract all the Prometheus files from the archive. Usually, you would have a disk mounted to the data directory. You also need a folder for the Prometheus configuration files.

Now, let's change the directory to Prometheus and move some files. First of all, let's move the prometheus binary and promtool to /usr/local/bin/. promtool is used to check configuration files and Prometheus rules. Optionally, we can move the console libraries to the Prometheus configuration directory. Console templates allow for the creation of arbitrary consoles using the Go templating language. You don't need to worry about them if you're just getting started. Finally, let's move the example of the main Prometheus configuration file.
To avoid permission issues, you need to set the correct ownership for /etc/prometheus/ and the data directory. You can delete the archive and the Prometheus folder when you are done. Verify that you can execute the Prometheus binary by running the following command. To get more information and configuration options, run prometheus --help. We're going to use some of these options in the service definition.

We're going to use systemd, which is a system and service manager for Linux operating systems. For that, we need to create a systemd unit configuration file. Let's go over a few of the most important options related to systemd and Prometheus.
Restart - Configures whether the service is restarted when the service process exits, is killed, or a timeout is reached.
RestartSec - Configures the time to sleep before restarting the service.
User and Group - The Linux user and group used to start the Prometheus process.
--config.file - Path to the main Prometheus configuration file.
--storage.tsdb.path - Location to store Prometheus data.
--web.listen-address - Configures Prometheus to listen on all network interfaces. In some situations, you may have a proxy such as nginx that redirects requests to Prometheus. In that case, you would configure Prometheus to listen only on localhost.
--web.enable-lifecycle - Allows you to manage Prometheus, for example, to reload the configuration without restarting the service.

To automatically start Prometheus after a reboot, enable the service. Then just start Prometheus. To check the status of Prometheus, run the following command. Suppose you encounter any issues with Prometheus or are unable to start it. The easiest way to find the problem is to use the journalctl command and search for errors.

Now we can try to access it via the browser. I'm going to be using the IP address of the Ubuntu server. You need to append port 9090 to the IP. If you go to targets, you should see only one target, Prometheus itself. It scrapes itself every 15 seconds by default.
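The unit file options described above can be sketched as a small Python helper that renders the file's text. This is only an illustration, not the exact file from the video: the helper name render_unit, the /data path, the prometheus.yml file name, the network-online.target lines, and the Restart=on-failure and RestartSec=5s values are assumptions chosen for the example.

```python
def render_unit(description, user, command):
    """Return the text of a systemd unit file for one monitoring service.

    The user name doubles as the group name, matching the system account
    created earlier. Restart and RestartSec values are assumed defaults.
    """
    lines = [
        "[Unit]",
        f"Description={description}",
        "Wants=network-online.target",
        "After=network-online.target",
        "",
        "[Service]",
        f"User={user}",
        f"Group={user}",
        "Type=simple",
        "Restart=on-failure",  # restart only when the process fails
        "RestartSec=5s",       # wait before restarting (assumed value)
        "ExecStart=" + " ".join(command),
        "",
        "[Install]",
        "WantedBy=multi-user.target",
    ]
    return "\n".join(lines) + "\n"


# Flags discussed in the video; the paths are assumptions for this sketch.
PROMETHEUS_COMMAND = [
    "/usr/local/bin/prometheus",
    "--config.file=/etc/prometheus/prometheus.yml",
    "--storage.tsdb.path=/data",
    "--web.listen-address=0.0.0.0:9090",  # listen on all interfaces
    "--web.enable-lifecycle",             # allow reload over HTTP
]

unit_text = render_unit("Prometheus", "prometheus", PROMETHEUS_COMMAND)
print(unit_text)
```

The same helper could be reused for the other services by swapping the user and the command, for example render_unit("Node Exporter", "node_exporter", ["/usr/local/bin/node_exporter"]).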
Next, we're going to set up and configure Node Exporter to collect Linux system metrics like CPU load and disk I/O. Node Exporter will expose these as Prometheus-style metrics. Since the installation process is very similar, I'm not going to cover it in as much depth as Prometheus.

First, let's create a system user for Node Exporter by running the following command. You can download Node Exporter from the same page. Use the wget command to download the binary. Extract Node Exporter from the archive. Move the binary to /usr/local/bin. Then, clean up: delete the node_exporter archive and folder. Verify that you can run the binary.

Node Exporter has a lot of plugins, called collectors, that we can enable. If you run Node Exporter help, you will get all the options. We're going to enable the logind collector, just for the demo. Next, create a similar systemd unit file. Replace the Prometheus user and group with node_exporter, and update the ExecStart command. To automatically start Node Exporter after a reboot, enable the service. Then start Node Exporter. Check the status of Node Exporter with the following command. If you have any issues, check the logs with journalctl.

At this point, we have only a single target in our Prometheus. There are many different service discovery mechanisms built into Prometheus. For example, Prometheus can dynamically discover targets in AWS, GCP, and other clouds based on labels. In the following tutorials, I'll give you a few examples of deploying Prometheus in cloud-specific environments. I also have a lesson on how to deploy and manage Prometheus in a Kubernetes cluster. For this tutorial, let's keep it simple and keep adding static targets.

To create a static target, you need to add a job_name with static_configs. By default, Node Exporter is exposed on port 9100. Since we enabled lifecycle management via API calls, we can reload the Prometheus config without restarting the service and causing downtime. Before reloading, check that the config is valid.
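As a rough illustration of the static target just described, the following Python sketch renders the scrape_configs section as YAML text. The helper name static_job and the job names are assumptions for the example; only the job_name and static_configs keys and the ports 9090 and 9100 come from the video.

```python
def static_job(job_name, targets):
    """Render one scrape_configs entry with static targets as YAML text."""
    quoted = ", ".join(f'"{target}"' for target in targets)
    return (
        f"  - job_name: {job_name}\n"
        f"    static_configs:\n"
        f"      - targets: [{quoted}]\n"
    )


# Prometheus scraping itself plus the new Node Exporter target.
scrape_configs = (
    "scrape_configs:\n"
    + static_job("prometheus", ["localhost:9090"])
    + static_job("node_exporter", ["localhost:9100"])
)

print(scrape_configs)
```

The printed text is meant to replace the scrape_configs section of the main Prometheus configuration file, which should still be validated with promtool before reloading.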
Then, you can use a POST request to reload the config. Now you should have a new target in Prometheus.

To visualize metrics, we can use Grafana. Grafana supports many different data sources, and one of them is Prometheus. First, let's make sure that all the dependencies are installed. Next, add the GPG key. Add the repository for stable releases. After you add the repository, update and install Grafana. To automatically start Grafana after a reboot, enable the service. Then start Grafana. To check the status of Grafana, run the following command.

Open the browser and log in to Grafana using the default credentials. The username is admin, and the password is admin as well. When you log in for the first time, you get the option to change the password. Let's use devops123 for the new password.

To visualize metrics, you need to add a data source first. Click Add data source and select Prometheus. For the URL, enter http://localhost:9090 and click Save and test. You can see that the data source is working.

Usually, in production environments, you would store all the configurations in Git. Let me show you another way to add a data source, as code. Let's remove the data source from the UI. Then, create a new datasources.yaml file. Optionally, you can make this data source the default one. Restart Grafana to reload the config. Go back to Grafana and refresh the page. You should see the Prometheus data source.

You can import existing Grafana dashboards or create your own. Let's create a simple graph. Go back to Prometheus, and let's explore what metrics we have. Start typing scrape_duration_seconds and click Execute. This metric shows the duration of the scrape of each Prometheus target. At this point, we have the node_exporter and prometheus targets. We're going to use this metric to create a simple graph in Grafana. Go to Grafana, click Create Dashboard, and then add a new panel. Give it the title Scrape Duration and paste the scrape_duration_seconds metric.
You can also reduce the time interval to 1 hour. For the legend, we can use the job label, and for the unit, seconds. There are a lot of configuration parameters that you can use. Let's keep it simple, click Apply, and save the dashboard as Prometheus.

Since we already have Node Exporter, we can import an open-source dashboard to visualize CPU, memory, network, and a bunch of other metrics. You can search for node exporter on the Grafana website. Copy the ID 1860 to the clipboard. Now, in Grafana, you can click Import and paste this ID. Then load the dashboard. Select the Prometheus data source and click Import. You have all sorts of metrics here that come from Node Exporter.

The next component that I want to install is Pushgateway. The Pushgateway is a service that allows you to push metrics from jobs that cannot be scraped. For example, you can have Jenkins jobs or some kind of cron jobs. You can't scrape them since they run for a limited time only.

The installation process is very similar to Prometheus and Node Exporter. Create a dedicated user first. Download the archive with Pushgateway. Extract all the files. Move the pushgateway binary to /usr/local/bin. Then, clean up. Check that Pushgateway can be executed. You can also get the configuration options by running help. Create a systemd service. Enable the service. And start Pushgateway. Check the status. Pushgateway is reachable on port 9091.

Let's add Pushgateway as a target to Prometheus. Check the Prometheus configuration. If it's valid, reload the config. Make sure that the target is up and healthy.

To send metrics to the Pushgateway, you just need to send a POST request to the following endpoint: http://localhost:9091/metrics/job/backup, where backup is an arbitrary name that will show up as a label. Use curl and pipe the string with echo to Pushgateway. Let's imagine that the Jenkins job that we named backup took almost 16 seconds to complete. You can find this metric in Prometheus.
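The push described above uses curl in the video. As an equivalent sketch in Python, the request could be built with the standard library as shown here. The function names and the value 15.98 (standing in for "almost 16 seconds") are assumptions for the example; the endpoint and the metric name jenkins_job_duration_seconds are the ones used in the video.

```python
import urllib.request

PUSHGATEWAY_URL = "http://localhost:9091/metrics/job/backup"


def build_push_request(metric, value, url=PUSHGATEWAY_URL):
    """Build a POST request carrying one sample in the text format.

    The body is "<metric> <value>" and must end with a newline.
    """
    body = f"{metric} {value}\n".encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def push(request, timeout=5):
    """Send the request to a running Pushgateway and return the HTTP status."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Building the request has no side effects; nothing is sent until push() is called.
request = build_push_request("jenkins_job_duration_seconds", 15.98)
print(request.get_method(), request.full_url)
```

Calling push(request) only works when a Pushgateway is actually listening on localhost:9091, so the sketch leaves that call to the reader.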
Refresh the page and start typing jenkins_job_duration_seconds.

When you install Prometheus, it is open to anyone who knows the endpoint. Fairly recently, Prometheus introduced a way to add basic authentication to each HTTP request. Previously, you had to install a proxy such as nginx in front of Prometheus and configure basic auth there. Now you can use the built-in authentication mechanism in Prometheus itself.

Let's install the Python module to create a hash of the password. Prometheus will not store your passwords; it will compute the hash and compare it with the existing one for the given user. Now, create a simple script that asks for input and returns the hash of the password. Run the script and enter devops123 for the password. Copy this hash and create an additional Prometheus configuration file.

Now, we need to provide this config to Prometheus. Let's update the systemd service definition. Every time you update a systemd service, you need to reload it. You also need to restart Prometheus. And check the status in case of an error.

Now, we can test basic authentication. Go to Prometheus and reload the page. Enter your username and password. If you go to the targets section, you will see that the Prometheus target is down. Prometheus requires a username and password to scrape itself as well. We also need to update the Grafana data source to provide a username and password. If you click test, you get an unauthorized error.

Let's update the data source config for Grafana to include basic auth. Restart Grafana. Next, let's update the Prometheus target to include the username and password. Check the Prometheus config and reload it. To reload, you need to include the username and password. Test the Grafana data source. And verify that the Prometheus target is up.

To send alerts, we're going to use Alertmanager. It takes care of deduplicating, grouping, and routing them to the correct receiver integration such as email, PagerDuty, or in our case Slack.
You can set up multiple Alertmanagers to achieve high availability. For this demo, I will install a single one.

First, let's create a system user for Alertmanager. Then, download Alertmanager from the same downloads page. Extract the Alertmanager binary. For Alertmanager, we need storage. It is mandatory (it defaults to "data/") and is used to store Alertmanager's notification states and silences. Without this state (or if you wipe it), Alertmanager would not know across restarts what silences were created or what notifications were already sent.

Now, let's move Alertmanager's binary to /usr/local/bin and copy the sample config. Remove the downloaded archive and folder. Check that we can run Alertmanager. You can also get help and all supported configuration options by running Alertmanager help. Next is the systemd service definition. Enable Alertmanager. Start Alertmanager. And check the status. Alertmanager is exposed on port 9093.

It's time to create a simple alert. In almost all Prometheus setups, you have an alert that is always active. It is used to validate the monitoring system itself. For example, it can be integrated with the Dead Man's Snitch service. If something goes wrong with Prometheus or Alertmanager, you will get an emergency notification that your monitoring system is down. It's a very useful service, especially in production environments. Let's create the alert, but without the Dead Man's Snitch integration.

You also need to update the Prometheus config to specify the location of Alertmanager and the path to the new rule. It's always a good idea to check the Prometheus config before restarting. Now we have a new alert.

Alertmanager can be configured to send emails and can be integrated with PagerDuty and many other services. For this demo, I will integrate Alertmanager with Slack. We're going to create a Slack channel where all the alerts will be sent. Let's create the alerts Slack channel. Next, create a new Slack app from scratch.
Give it the name Prometheus and select a workspace. You can modify the app from the Basic Information tab. Let's upload the Prometheus icon. Next, we need to enable incoming webhooks. Then add the webhook to the workspace. As the last step, we need to copy the webhook URL and use it in the Alertmanager config.

Now, update the alertmanager.yml config to include a new route that sends alerts to Slack. Any alert with the label severity equal to warning will be sent to Slack. Restart Alertmanager.

Now we need to include batch-job-rules.yml in the Prometheus configuration. Create an alert to test the Slack integration. Add the new rule to Prometheus. Check the config and reload Prometheus. Trigger the alert by sending a new metric to the Prometheus Pushgateway. In a minute or so, you should get a message in Slack. If we send a new metric with a duration of less than 30 seconds, Prometheus will resolve the alert.

If you are interested in deploying Prometheus to Kubernetes, I have another video. Thank you for watching, and I'll see you there.
Info
Channel: Anton Putra
Views: 27,598
Keywords: Install Prometheus and Grafana on Ubuntu, Install Prometheus on Ubuntu 20.04, Install Node Exporter on Ubuntu 20.04, Install Grafana on Ubuntu 20.04, Install Pushgateway Prometheus on Ubuntu 20.04, Securing Prometheus with Basic Auth, Install Alertmanager on Ubuntu 20.04, Alertmanager Slack Channel Integration, prometheus, grafana, alertmanager, pushgateway, slack, node exporter, devops, sre, anton putra, monitoring, prometheus installation on linux, grafana tutorial, prometheus tutorial
Id: Z7GxBf6us8Y
Length: 17min 55sec (1075 seconds)
Published: Mon Jan 10 2022