Prometheus, Alert Manager, Email Notification & Grafana in Kubernetes Monitoring | Merciboi

Captions
Hello friends, welcome to this demonstration. In this video we will see how to use Helm. Helm is a package manager for Kubernetes. We will use it to install Prometheus in our Kubernetes cluster, and Grafana in the same cluster. Prometheus is a monitoring tool which we intend to use to monitor our cluster: the nodes, the infrastructure, and the applications running in the Kubernetes cluster. While Prometheus does this task, we will use Grafana to visualize the data scraped by Prometheus.

Now, along with Prometheus come four other packages, which are installed at the same time when we install Prometheus. Those packages are Alertmanager, the Prometheus node exporter, kube-state-metrics, and the Prometheus Pushgateway. Of note for us, these are already configured in the Prometheus chart, which we will see in a moment. We will also configure Alertmanager to send email notifications to the email addresses configured in its settings. Again, we will open the user interfaces of these three applications to see how they work and how we can configure every part of this project to work together. So Prometheus will scrape real-time data from our Kubernetes cluster, provide that data for visualization in Grafana, and send alerts to Alertmanager. Alertmanager in turn sends the alerts as emails to our Gmail address, which we will specify.

So please feel free to subscribe to this channel if you have not done so, like the video, and share it. If you have any comments, please put them in the comment section. Happy watching.

This demonstration is brought to you by Merciboi Systems Solutions. Please feel free to check us out on the website on your screen. This demonstration shows how to install, configure, and use the Prometheus monitoring tool alongside Alertmanager, with Grafana for data visualization. Prometheus is quite an important monitoring tool, especially for Kubernetes clusters, so we're going to
be installing, configuring, and making use of Prometheus, alongside the other applications attached to it, to monitor our Kubernetes cluster. Every task that we carry out here is also applicable to large-scale production Kubernetes clusters. For me, right now, I'm using a Kubernetes cluster on my localhost: Minikube, a single-node cluster. You can use this in Minikube, and you can also use it in kubeadm clusters and in cloud Kubernetes clusters: Amazon EKS, Google Kubernetes Engine, or Azure Kubernetes Service. The same principle governs all of these clusters, and Prometheus gives you the added advantage of being able to monitor your cluster and your applications.

So let's get right into it. A quick look at my cluster right here: as I mentioned earlier, it has just started, so I have one service here, and that's my Minikube component. So everything is all set for me to run.

Now, the first thing to do is to install Helm. Below this video, in the description, there's a link to the GitHub repository where you'll find every reference and every link you need to run this task from beginning to end. For me, I already have Helm installed here. If you don't, go ahead and run the command just as you see it in the reference. So I have Helm. If I do a quick search, I don't have anything here yet, and I also don't have any application installed with Helm. Helm, as you know, is an application manager for Kubernetes. So the first step, installing Helm, is already done.

The next step is to install Prometheus in this cluster using Helm. The first thing I need to do is add the repository for the Prometheus Helm charts. This command does that. So this is the
prometheus-community repository. If I list my repositories, I find that it has been added. Once you've done that, you can update the repository in case there is any update to download. So that's done. After that, I want to search the repository's charts so I can choose which chart I want. I'm searching the prometheus-community repo. There are quite a lot here, but what we need for this task is this one, prometheus-community/prometheus. Part of the default configuration for this chart is that it has dependencies, and the dependencies will be installed alongside it, so I don't have to install these other ones manually. Those dependencies are Alertmanager, kube-state-metrics, the Prometheus node exporter, and the Prometheus Pushgateway. So these dependencies are part of what will be installed once I install Prometheus in this cluster.

So let's get right into it. To install, I run helm install. I want to call the release prometheus, and I'm installing prometheus-community/prometheus. Now, before I run this installation, I want to add custom configuration to it, to create the alert rules for my Alertmanager to use. I have not created that configuration yet, so I will comment this command out and create the file first. I'll create a directory called prometheus, because I want that file right inside this directory, and cd into prometheus. The file I want to add I will call prometheus.yml. The content of this file is also found in the repository for which I have provided the link under this video, along with every step-by-step instruction. So I want to copy the content of this file from there.
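A minimal sketch of what such a custom values file might look like. The real file lives in the video's repository and is not reproduced in the captions, so the file extension, the two rule expressions, the labels, and the annotations below are illustrative assumptions. The `serverFiles` / `alerting_rules.yml` key is where the prometheus-community/prometheus chart reads alerting rules from, and `-f` is the standard Helm flag for passing a values file.

```shell
# Write an illustrative values file with two alerting rules.
# The rule bodies are assumptions, not the author's actual rules.
mkdir -p prometheus
cat > prometheus/prometheus.yml <<'EOF'
serverFiles:
  alerting_rules.yml:
    groups:
      - name: NodeDown
        rules:
          - alert: InstanceDown
            # Assumed expression: a scrape target has stopped responding.
            expr: up == 0
            for: 2m
            labels:
              severity: critical
            annotations:
              summary: "Instance {{ $labels.instance }} is down"
      - name: LowMemoryAlert
        rules:
          - alert: KubePodNotReady
            # Assumed expression; relies on kube-state-metrics, a chart dependency.
            expr: sum by (namespace, pod) (kube_pod_status_phase{phase=~"Pending|Unknown"}) > 0
            for: 2m
            labels:
              severity: warning
            annotations:
              summary: "Pod {{ $labels.namespace }}/{{ $labels.pod }} is not ready"
EOF
echo "wrote prometheus/prometheus.yml"

# With Helm and a cluster available, the install would then be:
#   helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
#   helm install prometheus prometheus-community/prometheus -f prometheus/prometheus.yml
```

The heredoc delimiter is quoted so that the shell leaves `$labels` untouched; Prometheus expands those templates itself when an alert fires.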
Yes, I have to copy all of it into this file, everything. So these are the rules. I have two groups in these rules: node down and low memory alert. You can have as many rules as you want, depending on your use case. Node down has only one alert in it, InstanceDown; when its conditions are fulfilled, the alert is triggered. Then we have the low memory alert group. The first alert is low memory, triggered when these conditions are fulfilled. Next we have KubePersistentVolumeErrors; when these conditions are fulfilled, the alert is triggered. Then we have KubePodCrashLooping; when these conditions are fulfilled, it triggers this alert. And finally we have KubePodNotReady; when these conditions are fulfilled, it triggers the alert. As I said earlier, you're free to add as many rules as you want, depending on your use case. I'm good with this, so I save and quit.

Now I can run this command, helm install, and I'll pass a flag to add this custom configuration, these custom alerting rules, into the installation. So that is done; Prometheus is now installed here.

The next thing I need to do is connect to my Prometheus server UI. There are multiple ways to get to it. The first is to run this command; once I run it, I'll be able to access the UI from this local system. The second is by using the kubectl expose service command. The third is by using an Ingress, the NGINX Ingress, which would also take some configuration. For this task I will go with this command the first time. So I copy this command, get into another terminal right here, and paste it. It runs and gives me access. Because I'm running this from my local system, this tab has to stay open for me to be able to access the UI. I'm connecting to localhost on port 9090, so let's connect to it in our browser: localhost:9090. So this is Prometheus right now. If you go to Alerts, these are the rules I mentioned
earlier: two groups, basically, node down and low memory alert. And these are the alerts; remember we saw four alerts in the second group. This one is already active; we'll get to see that in a moment. The conditions are shown below each of them. So feel free to explore this user interface for your Prometheus; there is a lot of information to see. Under Rules, we have the rules here. The first group is node down, with InstanceDown: when an instance is down, this is triggered, but we're not going to be able to see that here. Then low memory alert, with four alerts: first one, second, third, and fourth.

Then you have Targets, with your endpoints. Because it's a single-node cluster, we have only one node, so that's the one endpoint under kubernetes-nodes. We have only one API server because we have one control plane. Then cAdvisor, the Kubernetes service endpoints, metrics, Prometheus itself (this is the Prometheus we're connected to right now), the Pushgateway, and so on. So feel free to look around all of these.

So, back to the alerts. We've been able to access our Prometheus server through its graphical user interface. The next thing we want to do is access our Alertmanager server. To access Alertmanager, the same methods are available: either we run this command here, or we use the kubectl expose command, or we use the NGINX Ingress, which, as I mentioned earlier, will require some configuration. I used the first method for Prometheus, so for this one I will use the kubectl expose command. But before I run that, let's look at the services we have: kubectl get service. So this is the service for the Alertmanager that we want to access, and I want to expose it right now. The command is kubectl expose service, the service is prometheus-alertmanager, and the type of service I want to create
is NodePort. The target port will be 9093, the target port for Alertmanager, and I want to call this new service alertmanager-access. There's an error right here, so let me quickly check that. prometheus-alertmanager is the name of the service right here, and everything looks fine. Oh, there's an error in how I typed "alertmanager". I repeat the command with that corrected, and it's done. So we've run that. If we run kubectl get svc now, we'll see alertmanager-access. We don't have any external IP; that's because I'm using Minikube in my local environment. If you're on the cloud, at this point you would simply use the IP address of your node together with the node port provided here, and you'd be able to access Alertmanager. But for me, working in Minikube, I have to do more than that. I get into a new tab right here, and because I'm using Minikube, I run minikube service alertmanager-access, which is the name of the service. Minikube is able to map it and give me an external URL to access. So I'm going to be accessing this URL from my local environment. I've copied the URL; 127.0.0.1 is equivalent to localhost, of course, followed by the port Minikube assigned. So I'll get into my browser to use that: localhost:45419 gets me to my Alertmanager.

So this is my Alertmanager right here. Right now I have one alert active, the same alert that is active in Prometheus. A few things to take note of: we have the default receiver here, but that's not what we want to use. If you go to Status, it is ready, so that's no problem. But the configuration is what I want to change, and I will do that in the next step, to be able to receive email notifications when something goes wrong. When an alert is triggered, it sends an email to me so I can attend to the problem and solve whatever needs to be solved. So we're going to
get to that in a moment; that's the next step we're going to run. Now, Silences: when there's an alert here and you want to silence it, maybe because you are expecting that alert, you can simply silence it so that it doesn't send you a lot of messages. Once you silence any alert here, it appears in your Silences right here. So that's that. And the default receiver here is because of this configuration in Status, where we have the receiver as the default receiver, but we're going to change that in the next step.

So we've been able to access our Alertmanager UI. Let's quickly go and configure email notification for our Alertmanager. We get back to our command line. A quick look at what we have at the moment: for me, I'm working in the default namespace for everything, because that's what I chose to do. Whatever your environment is, say a production environment where you have different namespaces, please feel free to specify the namespaces; it makes your work much more organized. So this is what I have overall.

To do this, we make an edit in the configuration that was used to set up the Prometheus Alertmanager, and that is through the ConfigMap. So let me get the ConfigMaps I have available. I have two of them related to Prometheus, but I'm interested in prometheus-alertmanager, because that's where I want to edit something so that my email notification gets to me whenever there's a problem. To edit that, I run the command kubectl edit configmap prometheus-alertmanager, and I will be editing from this point, global, through to the receiver. This is where we have the default configuration that we saw in the Alertmanager UI. To be able to change that, let's get to the configuration we want to use
there. So, back to the same GitHub repository we've been working with. The component is here, the Prometheus alert YAML file. You simply copy from global to the end; I will explain the components in a moment. So I copy from global to the end, go back into my command line, and paste it right in here. Now to finish up the editing. The resolve timeout is 1 minute; that's fine for me, but it can be 5 minutes, 2 minutes, anything. gmail-notifications is what I chose to call the receiver; it can be anything. For the email configuration, we have to put in working, active email addresses, or one address, as the case may be. "to" is where you want to send the email. I chose to send the email to this address. You can have multiple email addresses; all you need to do is specify all the addresses you want to use. I don't need that at the moment, so I am using only one email address. The next thing is the "from" address, which is where the email will originate from. I chose to use the same one, but it can be any Gmail address. The smarthost is constant for Gmail: smtp.gmail.com:587. The auth username must be the same as the "from" address, because that is the account this application authenticates with to be able to send email to the addresses you have in "to". So I put mine in there. For the password, you could put your email account's password here; this is the password the application will use to authenticate itself to send the email. However, it is not safe to put your password right here. There are several ways around that: you can put the password in as an encoded secret, creating a Secret and passing the reference here, or, to keep it as simple as possible, there is something called an app password in Gmail. So we simply go to our Gmail account and create an app password that we can attach to that.
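The fields described above can be sketched as the following Alertmanager configuration. This is an illustration, not the file from the video's repository: the addresses and password are placeholders, the local file name is made up for the example, and the `route` block is an assumption added so that alerts are actually routed to the new receiver once the default one is removed. The `email_configs` keys themselves (`to`, `from`, `smarthost`, `auth_username`, `auth_password`) are standard Alertmanager settings.

```shell
# Write an illustrative Alertmanager config to a local file for reference.
# In the video this content is pasted into the prometheus-alertmanager ConfigMap.
cat > alertmanager-gmail.yml <<'EOF'
global:
  resolve_timeout: 1m
route:
  # Assumed: route everything to the Gmail receiver.
  receiver: gmail-notifications
receivers:
  - name: gmail-notifications
    email_configs:
      - to: you@example.com            # placeholder recipient
        from: you@example.com          # placeholder sender
        smarthost: smtp.gmail.com:587
        auth_username: you@example.com # same account as "from"
        # Placeholder: a Gmail app password with the spaces removed.
        auth_password: REPLACE_WITH_APP_PASSWORD
EOF
echo "wrote alertmanager-gmail.yml"
```

Keeping a copy like this outside the cluster also makes it easier to re-apply the settings if the ConfigMap is ever overwritten by a chart upgrade.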
So, into our email account: you go to Manage your Google Account, which gets you to the home page, and you click on Security. When you're in Security, you come to the search box here and search for app passwords. Before you can use an app password, you must have two-factor authentication enabled for your Gmail account. You click on App passwords, and it requires you to put in your password to confirm that you are the one making this change. You put your password in and it lets you in. I don't have any app passwords now, so I will go ahead and create one. Let me call the app password K8s, for Kubernetes; it can be anything. I'll create that now. Please make sure you copy this, because you won't see it again afterwards. So you copy it and click Done. Once you click Done, it's been created. With the password you copied, you come in here and edit this, and while editing you remove the spaces. That's it; everything is done and ready. But remember, we need to remove something from here: we remove the default receiver. We don't need it anymore, so everything under that default receiver we remove. And that's everything, so we save and quit. It says the Alertmanager ConfigMap has been edited.

However, the Alertmanager pod, which is this one, was created before we made this edit, so the edit will not take effect here, because this pod is already running. If you go to the UI, it will not have taken effect there either. I'll refresh this: it's still the default receiver, because the pod is already running. Now, to make the change take effect, you simply delete this pod. When you delete the pod, it's going to be recreated by the StatefulSet from which it is deployed. Alertmanager was deployed as a StatefulSet, so if we delete the pod, the StatefulSet will recreate it. So let's get the pods again and then run
the command to delete the pod: kubectl delete pod, and the pod you want to delete is prometheus-alertmanager-0. Once you delete that and check again, it's being recreated, and the new pod is now running. Now that it's running, we get back to our UI and refresh, and you see the change has taken effect: global, gmail-notifications, and all our settings have been applied. So from this point I can receive Gmail alerts from my monitoring, from Alertmanager.

So let's get right into it. Back in our Prometheus, we have one alert that's active here, low memory, and we have one alert active here. The low memory alert seemed to be resolved, but it's being refreshed. Just give it a little while; because we just changed the configuration, it will reload the alerts at the specified times. The time for low memory is 2 minutes, so in the next two minutes I will have an alert triggered right here. The time for virtually every one of them is 2 minutes; pod not ready, 2 minutes. So when two minutes have gone, we will get an alert triggered here because of the change that we have made. Right now we have our alert right here, one alert, which is the same thing we have here: one alert active. Our default receiver is gone; we now have gmail-notifications. So it's going to send me a Gmail alert, which has already come in: Prometheus alert, one alert for low memory. So that's it. I have one alert, and it will keep sending the alert until it is resolved.

So we have one alert. I'll make a sample deployment that's going to create another alert. This YAML file contains a manifest that's not properly configured. We simply apply that, and then if we get pods, we see an image pull error: the image does not exist. So it's going to create an alert in my Prometheus and in my Alertmanager. Remember, it's
for 2 minutes. So if the pod doesn't come up in 2 minutes, it sends an alert. That's the beauty of Prometheus as a monitoring tool: it's able to send you an alert, and you simply go in, troubleshoot whatever is triggering the alert, and solve the situation as quickly as possible, so as to maintain your application and make sure there's no downtime. In two minutes we're going to have the alert triggered, and we're going to receive an email alert for that. So let's wait for it.

Okay, we have our alert here already: KubePodNotReady. Let's take a look at our Alertmanager. It's yet to come in here. The process is that the alert is picked up by the Prometheus server and then forwarded to Alertmanager, from where the email notification is sent to whichever email address or addresses you specified in the configuration. Okay, right now we have the two alerts here, and it has already sent an email notification. This one is still for the low memory alert; I have only one, for one alert. I'm going to get another email soon specifying the two alerts, the low memory and the KubePodNotReady. Okay, right now I have the email for the 2 alerts that have been triggered: the low memory alert and the KubePodNotReady. It will keep sending these alerts until I get into my Kubernetes cluster and solve whatever problem has triggered the alarm. So that's Alertmanager and Prometheus in action.

The next thing we're going to do is install Grafana, which is our visualization tool. So, back to our command line. The first thing we need to do is add the repository for the Grafana Helm charts. That's done. We run helm repo update; updated. Now let's search the repo: I want to search for grafana. There are quite a lot of charts, but what we need now is just grafana/grafana. That's what we want to
install now. So let's get right into it: helm install. I call the release grafana, and we are installing grafana/grafana. I'm installing in the default namespace. If I had a different namespace I wanted to install into, I'd simply add the namespace flag, but I'm doing everything in the default namespace. Feel free to use whichever namespace you like. So, installing grafana/grafana now. Okay, that has been installed. The next thing to look at is that you will log in with the admin user and with a password, which you get when you run this command. So let's run this command right here. This is the admin password for the login. Now, to access the Grafana URL, you run this command. Again, you can also access the URL with the second option, which is to expose the Grafana service, or with the third option, which is the NGINX Ingress. But for now I'll run this command. My other tab is still running, so I'll open a new tab right here and start it. So my Grafana is running on my localhost, port 3000. I get into my browser, go to localhost:3000, and it requires me to log in. Remember, the user is admin, and we copy the password we obtained here, go back to the UI, put the password in there, and log in. So we have successfully logged in to our Grafana UI.

The first thing we need to do is add our data source, the source of the data we want to display in Grafana, and we are using Prometheus. Now it's asking for the Prometheus server URL. You choose to call this Prometheus; it can be anything. And you can choose to make it the default, or you can uncheck this and not make it your default. Because I'm running on my localhost and working with Prometheus from here, I wouldn't be able to reach it from Grafana with localhost:9090. I would have to expose the Prometheus service to get a Minikube service URL, to be able to access my Prometheus server from the
Grafana UI. So what do I do from here? I simply run kubectl get services, and the service I'm interested in now is prometheus-server, because I want a URL I can add to the Grafana UI, a URL that is different from localhost. To get that, I'll do kubectl expose service. The service I want to expose is prometheus-server, the type of service I want to create is NodePort, the target port is 9090, the port we used to reach it, and I want to call it prometheus-access. Okay, so Prometheus is exposed. If I run kubectl get service, I'll find prometheus-access right here, but with no external IP. To get an external URL, I'll run minikube service prometheus-access. So it gives me the URL, and I'm going to use this to connect to Prometheus. Back in my UI, I put that in there, leave the default settings, and click Save and test: successfully queried the Prometheus API. If we had used localhost, it would not have worked. Again, all of these things are because I'm working on my local system. If you're on the cloud, once you expose the service, either as a NodePort service or as a LoadBalancer, you get an external address, and with that you can access it.

So this has been saved and it's successful. I have the data source in place, so let me get into my dashboards and create my first dashboard. I want to import a dashboard. From here, instead of creating dashboards by yourself, adding visualizations and creating rows and panels, you can simply import ready-made dashboards using their respective IDs or URLs. You can search for these, but I have a few of them right here. I have 6417. So if I load this, I get this dashboard, and I can choose what to call it. They call
this Kubernetes Cluster (Prometheus); you can call it whatever you want. The folder is General, and it has the UID. Now you select the data source, which appears here because we created this data source a moment ago; we called it Prometheus and made it our default. So we put Prometheus in there and simply import. So this is our dashboard. If you want to keep this, you can go ahead and save the dashboard. You can have as many dashboards as you want. Here it gives us general information: cluster pod capacity, cluster pod usage, CPU usage, memory capacity, disk capacity, and the deployment replicas you have. You're able to see all of these. Number of nodes: I have only one node. I have 13 pods running and one pod pending; remember, there's one pod that didn't start because we used the wrong image, to create an alert. Containers running: 14. Containers waiting: one. Terminated: zero. Restarts: not applicable. So here you're able to visualize everything you want. This is for all nodes and all namespaces. If you have a specific namespace you want to visualize, you simply put in the namespace; if there is a specific node you want to see, you simply put in the node. I'll save this dashboard. I've saved the dashboard, so I can go back to Home.

I can import another dashboard. Let's say I want to import 3119. I load it, select the data source, Prometheus, and import. So this is another dashboard: network input/output pressure. The data source is the default, which is our Prometheus server. Node is all nodes; if you have different nodes, you can select whichever one whose metrics you want to visualize right here. And you have total cluster memory usage:
this much is being used out of 6.69 GB. Cluster CPU usage: you can see everything here; out of 16 CPUs we are using 0.53. So this keeps you informed of exactly what you're using. Pods memory, container network, pods network, and so on. All these settings can be adjusted just as you like. So I can save this one and get back to my dashboards.

I want to import another dashboard; let me import 11074. I load that, select my Prometheus data source, and import it. So this is my 11074 dashboard: CPU usage, memory usage, partition used, disk writes, and so on; uploads and downloads in kilobits per second, the IP link right there. You have values already, because I have resources running in this cluster. So please take your time and take a look at all of these. Some dashboards are more detailed than others. CPU usage, you can see that; there is some data here; network bandwidth, disk read and write. So I can choose to save that and go for another dashboard. Let's import 8919. This appears to be in another language, but nevertheless it's still a usable dashboard, so if you can understand it, that's fine. I can choose to save that as well. And that's the final dashboard I have here. There are many of them online, so you can use as many as you want, depending on your use case. The data source is still Prometheus. Here you have the running pod count, one pending, node storage, CPU idle, average memory, disk, containers, and so on.

So that's the task. Our Kubernetes cluster monitoring is up, and we're fine at the moment. We have all our dashboards here, so you can quickly go to whichever dashboard you want to see. This cluster one is the first one we had; we can go back into it. And you can pick as many
dashboards as you like, depending on your use case. Whichever one you prefer to use, please feel free to use it. I hope you've enjoyed the video. The alerts I have will keep coming in until I address them or silence them. So that's the task. I have the GitHub repository attached here; please feel free to get it and try the project yourself. Everything that we've done here is applicable to production Kubernetes clusters, so it is real and viable. If you have not subscribed yet, please subscribe to our channel, like this video, comment on it, and share it, and you'll be notified of our new videos. Thank you very much for watching. See you.
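The main commands from the walkthrough can be collected into a rough recap. This is a sketch under stated assumptions rather than the exact commands typed in the video: the hyphenated service names `alertmanager-access` and `prometheus-access` and the values file name are reconstructed from the captions, and the `run` helper is a hypothetical wrapper that only prints each command unless `DRY_RUN=0` is set. The `helm`, `kubectl`, and `minikube` subcommands and flags themselves are standard.

```shell
# Hypothetical helper: print commands by default, execute them when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Prometheus; Alertmanager, node exporter, kube-state-metrics and
# Pushgateway are installed as chart dependencies.
run helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
run helm repo update
run helm install prometheus prometheus-community/prometheus -f prometheus.yml

# Expose Alertmanager and Prometheus as NodePort services (names assumed).
run kubectl expose service prometheus-alertmanager --type=NodePort --target-port=9093 --name=alertmanager-access
run kubectl expose service prometheus-server --type=NodePort --target-port=9090 --name=prometheus-access

# On Minikube, ask for a reachable URL for each exposed service.
run minikube service alertmanager-access --url
run minikube service prometheus-access --url

# Edit the Alertmanager config, then delete the pod so the
# StatefulSet recreates it with the new settings.
run kubectl edit configmap prometheus-alertmanager
run kubectl delete pod prometheus-alertmanager-0

# Grafana.
run helm repo add grafana https://grafana.github.io/helm-charts
run helm install grafana grafana/grafana
```

Run as written, the block only echoes the commands, which makes it safe to read through before pointing it at a real cluster. With `DRY_RUN=0` the steps would execute in order, including the interactive `kubectl edit`, so in practice they are better run one at a time.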
Info
Channel: Chukwudinma Ikechukwu Akabuogu
Views: 1,929
Id: TyBsKMTDl1Q
Length: 54min 47sec (3287 seconds)
Published: Wed Oct 18 2023