Monitor Your System with Grafana using Netdata and Prometheus

Captions
Previously, I installed Netdata on my OpenMediaVault server. Netdata is really easy to install: it's just one line, one command. Go ahead and check out the previous video if you want to know how to do that, because you will need Netdata installed to follow along with today's project, which is taking bits and pieces of what you see here. There is a lot to digest here; you can see a lot going on. So what I'm going to do is take the bits and pieces that I want to see and put them on a custom dashboard using Grafana and Prometheus, and this is how it's done. [Music]

So, jumping right into it, I created this awesome list of things that you'll need moving forward into this project. The first thing you're going to need is Netdata already installed on the server. The second thing you'll need is a Grafana and Prometheus host. Now, I typically like to host these separately rather than putting them on the server. You can do that if you want, but I'll be hosting Grafana and Prometheus on a Proxmox LXC container. And lastly, a piping hot cup of coffee for more brain power, of course. Yes, we all need that, or your favorite relaxing beverage of choice, whatever that might be.

So the first thing's already been established: I do already have Netdata installed on my server. The next step is figuring out where you want Grafana and Prometheus to be hosted. As I stated earlier, I like mine to be separate from the server itself, just because I like to containerize things and keep things more organized. So what I did is set up a Proxmox LXC container where I will install Prometheus and Grafana using Portainer and Docker Compose stacks. So we're going to go into the Portainer dashboard and add a new stack. Remember, this is the one we created in the Proxmox LXC. I'm going to name this grafana. I'll have all of this information on the homelab.wiki, so you can just copy and paste this into your web editor in Portainer, and then you'll go ahead and launch the stack. This will install Grafana, and then we'll
go ahead and set it up after that. Well, look at that: Grafana was successfully installed. Let's go ahead and check it out. We'll click on it here, and this will take you to the dashboard, where the default username is admin and the password is admin. It'll take you to another screen where you have to create a new password, so you'll make a super secret password, put it in here, and then you'll log in. Once you get logged into Grafana, there's not a whole lot to see here, and it can look a little intimidating if you ask me, because the first time I used it I didn't know what I was doing. But we're going to install Prometheus first before we dive into dashboards.

To do that, we go back to our Portainer dashboard and click on Stacks. We'll add another one, paste the content into the web editor, and call this prometheus. Again, you can grab this off the homelab.wiki. I'll have all the instructions in the description below the video, so you can just grab this over there. Go ahead and deploy the stack, and we will get Prometheus installed.

I ran into a small issue here after I installed it, so I'm going to check out the log. It says it stopped, and it looks like it can't find the file. That's because we didn't create the prometheus.yml file, so I will show you really quick how to do that, and then we can start it back up. Over here on my Grafana and Prometheus host, I'm going to clear this out. I'm going to cd into the root, and I'm going to look for the docker folder where the files were placed for Prometheus. So I'll type in cd docker, and you can see here that prometheus is in there. Then I will cd into the prometheus directory, and here I'm going to touch a file, which is prometheus.yml. This will create the file that we need for Prometheus to run through Docker. Now, over here on our Portainer dashboard, we'll
check this and click Start, and this will start up Prometheus so we can get it up and running. And that did the trick. So let's go ahead and get started with Prometheus. You can see that it is running here, and everything looks good. Let's verify that the web version (the web UI, I should say) is up and running, and it is. I'll check a couple of links here to make sure everything's connected properly. Looks like it is, so this is good. Now we can get started and edit our file.

But before we do that, I want to briefly explain what Prometheus is and what exactly it does. Prometheus itself is actually pretty amazing software. It doesn't require a separate database, and all you really have to do is edit one file to make it function properly. This right here is the prometheus.yml file that I'm currently using for my servers. This isn't going to be the one I'm using in the example; I'm just showing you this so you know how Prometheus works. Say, for instance, we have Netdata and we want to send that information in here through the API. Prometheus checks this file and then brings the data back to the web UI, where it displays all those metrics. Then, from there, we'll pull the information into Grafana.

With that out of the way, I think it's time to talk about editing the prometheus.yml file. If I ls here in this directory, we can see that file. I'm going to nano into that. Right now there isn't anything in it, but I'm going to paste a bunch of goodies in here, including the Netdata scrape information. It gives it the job name, the metrics path (which is the API for Netdata; you can see all that information here), and then the IP and port of where Netdata is. So this is really cool: it's actually going out and finding this information through the API. Today we'll be focusing on monitoring one instance of Netdata, so I figured it'd be easier, but if in the future you do want to monitor more than one, you can add another section like this. Just
copy it and paste it below this section here, and you'll have to modify the name and, of course, the IP where that Netdata instance is being hosted. Just as an example, I'm going to paste this in here, just like so, and you'll see another section just like this. But you will have to change a couple of things, because you can't have two jobs with the same name being monitored. You'll have to change the name up here, because this is the Prometheus unique identifier, and then of course down here you will have to change the IP of where Netdata is being hosted. So, yeah, we're going to take this out, though. We're just going to focus on one, and then we'll go from there.

All this information will be on the homelab.wiki; the link is in the description below. When you do paste this into your prometheus.yml file, you will have to make sure that you change the IP down here, right where I'm highlighting. You want to make sure that this is the IP where your Netdata is installed, because that's where Prometheus is scraping that information from. It's pulling it through the API. Just for reference, we can see that it is here on this IP and port, so I can compare this with what is in my YAML file and make sure that it's the right one. So we'll go ahead and save this: Ctrl+X, and we'll hit Y and then Enter.

Back here in the Prometheus dashboard, if I look at the metric explorer, it's not going to show anything, because Prometheus can't see the change until it has been restarted. So I'm going to restart Prometheus so I can see the changes that we made to that file. Let's go ahead and do that now. This is what's really cool about having Portainer installed on your system: you can just pop onto Portainer and start and stop your services quickly and easily. So let's go back to Prometheus and hit that explore button, and here we are. If you don't see this right away, just refresh your Prometheus dashboard and it might pop
up. Again, it might take a couple of seconds before it can actually aggregate this information, but this is all the information that's being pulled from my server that has Netdata installed on it. You can see all these different metrics, and this is what I mean: there's a lot of information here, which is why I want to be able to pick and choose what I want to display on my Grafana dashboard. So let's go ahead and start doing that.

Okay, so I hope you guys still have some nice warm coffee in hand, or whatever your comfort beverage of choice is. Before we start building our Grafana dashboard (you can see this is Grafana here), we need to actually add a data source, and that is, well, you guessed it: we're going to be adding Prometheus. So how do we do this? It's pretty simple. You just click on the data source for Prometheus, and we add a new one. So where is our Prometheus located? It's going to be on my LXC container where it's hosted, so let's go find the IP and port for that. Here it is, right here: I can see the IP and port in the address bar. I'm going to copy this, take it back over to the dashboard, and just paste it in right here. I'm going to make sure that I get rid of the trailing slash, and that's it. You don't have to touch anything else or make any other changes. You just hit Save & test. If you see "Data source is working", you are golden and you are good to go.

Okay, so time to get into the nitty gritty of everything. I'm back on the Prometheus dashboard, and as I said before, there is a lot of information and a lot of metrics in here to sift through, so I've gone through and made things a little easier for you. You guys can do different things if you want, but I'm just going to start with the basics in this video, because it can get really long. Instead of actually going through all of those, I'm going to use the finder, and this will show me what the uptime is. I'm going to show you that you can actually see it in the graph here in Prometheus. It's not
very pretty, but if I take this right here and plug it into Grafana, now that is a different story. So let's take this and plug it into Grafana.

Back on Grafana, here on the left-hand side, you'll hover on the plus sign and then click on Dashboard. This will get a new dashboard started. We're not creating one from a template; this is from scratch. We'll click on "Add an empty panel", and then this section right here is where we're going to paste the query that we just copied from the Prometheus dashboard. Once we click elsewhere, outside of that metrics browser box, this chart will show up with information. Prometheus is measuring our uptime in seconds, so the first thing I'm going to do is scroll down here on the right side and change the unit to seconds. It's kind of hard to find, but you have to go into Time and then into seconds. And I want a number and not a graph, so I'm going to change this from Time series to a Stat. So we've been up for 3.52 hours. That's not very long.

There are all kinds of different customizations you can do in here, but I'm going to go through and do a couple of different ones. I'll change the orientation to horizontal. I'm actually going to delete the threshold, because there is no threshold for my uptime, and I'm going to keep it green. I like that. I'm going to change the title to just Uptime, and I'm going to apply that for now. And we have our first panel on our new dashboard. So we've been up for 3.52 hours, and I did turn my server on three and a half hours ago so I could start recording videos for this.

What should we do next? Maybe something for the CPU. I think what I'll do now is grab the CPU temperature; that seems like a good one to do. So let's go back to the Prometheus dashboard and grab that metric. Okay, so here we are on the Prometheus dashboard. I'll go ahead and paste the
appropriate metric in here and hit Execute. I'll reiterate one more time that these will be on the homelab.wiki, so you can copy them from there. I will copy this right here, go back to my dashboard, and click on this little Add panel button here. I'll do another new panel (an empty panel, I should say) and then paste that query in there. And there's the chart we have, but for this one we're going to make some more changes.

The first thing I'm going to do for this one is rename it to CPU Package... my bad, Package Temp, and that's what I'll call it. You'll see it change up here at the top. Then I'll change it from Time series to a Stat. I love the way this looks. It's really cool, and we can change the colors and stuff later. It reads 41.1 degrees, and this is Celsius, so that's good too. To show that, we can go into the Standard options under Unit. Down here, it is, I believe, in Temperature, and we will change it to Celsius. That way you know that it is indeed Celsius. So we'll click Apply, and that is now added to our dashboard.

Say we wanted to go back in and edit this. We can click on the little drop-down arrow at the top and click on Edit. Say we want to change the color. I'm actually going to get rid of the threshold, because I'm not going to add a threshold to that one right now. If you want to add a threshold, you can; it'll change color if it gets to a certain temperature that you think is too high. I'm going to change this to a custom color, and let's make it blue, shall we? Something like this. I like that. So let's click Apply on that. Now we can distinguish a little more clearly between the uptime and the CPU package temperature. If we want to resize these, we can do that by dragging them left or right, or up and down, to make them bigger or smaller. I am going to do this so we can add some more things and start filling this out.

Since we're on the CPU train, how about we add the CPU average percentage being used? Let's do that. So on Prometheus I will plug in the
appropriate metric once again, and I will execute it. You can go through the browser and find them in here as well, but since I already know what they all are, I'm just going to execute them. My processor has a lot of different cores, so there's a lot of information here, but the one I want is cpu0, and I found that the "user" value gives me the most accurate reading for the processor. So I'm going to grab that, and we'll plug it into the dashboard in a new panel.

Once again, we will click on Add panel and add an empty panel. We'll paste our information in once again, just like so, and it gives you the same chart that it always gives you. For this one, I'm going to use a gauge, so let's pull the menu down and use a Gauge for the CPU percentage. I did rename it to CPU Busy. You can name it whatever you want; if you want to name it this, you can. Scrolling down here, I did make another change under Unit in the Standard options. If you click on that and go under Misc, make sure you choose Percent (0-100), because that's the percentage it's measuring. Then, in the Max section, I did put 100. And then down here under Thresholds, I did add an 80. Well, it actually came standard with 80, so if it does reach 80, it will start to turn red instead of green. That's always a good indicator if the CPU percentage is too high.

Okay, all that's left to do is click Apply and add this to the dashboard. There we go, like that. Let's make it the right size. I'm going to rearrange these. I like my uptime to be the first one, and I actually like that one to be smaller, because it's not super important like these other ones are. So this is actually starting to look like a dashboard. It's coming together, so you can see where this is going. Before I get too much further, I'm going to save this so I don't lose all of my work. I'm just going to call this Server, and we'll leave it in the General folder and click Save. Now that we've saved the
dashboard, you'll actually start seeing the refreshes take place here. You'll see the temperature change, you'll see the CPU percentage change, and even the uptime will change automatically now that you've saved the dashboard.

I think it's time to start showing some memory, or some RAM, on this dashboard, so let's go check that out. Okay, once again here on the Prometheus dashboard, I'm going to plug in the system RAM megabyte average, so I'll go ahead and execute that query. These are the ones we have available here, and the one I'm interested in right now is how much RAM I'm currently using, so that's this one down here on the bottom. Copy that and take it back into the dashboard. You know the drill: click on Add panel, add an empty panel, and we paste it in here, just like so, and we get the default graph. Let's make some changes to this.

A few things I changed on this one, starting at the top: I changed it to a Stat graph. You know I like that one. I just like the way it looks. I like the graph, I like the numbers. I also renamed it to Memory Used here in the title. Scrolling down, under Standard options, the unit is megabytes, and that's going to be found under Data. Make sure you choose the right one: there's this one here, and then there's megabytes. You want to make sure you choose megabytes, and that's it. I scrolled down here, and I did also change the base color. I removed the threshold that was there and changed the base color to orange, because I just think it looks hot. So I'm just going to apply that and change the size of this, and that's how much memory my server's using right now. Not very much, because I'm hardly running anything on this right now, just a couple of Docker containers, and that's it.

Okay, it's time to cheat a little bit. We have the memory used in here; now we want to know how much is free. So I'm just going to do a duplicate. I will duplicate it, and I will go into Edit and I
will change one little thing in here. Instead of saying used, I'll just change it to free, and that's it. That's how easy that is. And if you want, obviously, you can change the color down here. That way you don't have to go through changing everything again, and, yeah, it's just easy to do it this way. Hit Apply. Oh, I did forget to change something here. Let's go back in and change the title to Memory Free. So you do have to change that. Make sure you change the title before you apply it, but you can always go back in and edit it. So there it is: Memory Used, Memory Free. We have our uptime; again, I like to keep my uptime up top.

I do want to show you a different style of chart that you can do for the CPU, so let's duplicate this one here really quick. Let's do a duplicate and edit it. For this one I'm going to do a Time series, which is the default chart that you see when you add a new panel in here, so it's always going to be a time series chart to begin with. But I'm going to make some changes, and then I will show you what I did.

So "CPU Usage Flow" is what I titled this one, as you can see up here at the top. It does look a lot different from the default: a flowish-looking chart. Quite a few things were changed here, actually. I did enable the transparent background, but that doesn't matter, because I colorized it anyway. Towards the bottom, you can see that the List legend mode is enabled and Bottom legend placement is turned on. You can see the different things that I have here; I'll just scroll down so you can see them. When you get down here to Unit, you want to make sure you change this back to none and remove the max, and that will give you this cool, flowish-looking chart you see here, once you take the maximum of 100 out and replace it with nothing. To get the colors, I did change the thresholds to blue, yellow, and red. You can drop down and change that. And then my base is just a standard blue color here, and I
did choose "As filled regions and lines" for Show thresholds. So that's it for the CPU flowish-looking chart, and I will click Apply, and there you go. I do think this bottom legend area looks a little cluttered, so I'm going to go in here and change that as well really quick. You can add whatever you want in here. I'll just put CPU Chart and save that, and that'll make it look much cleaner. Just click Apply, and there it is.

Okay, so there's one more thing I want to cover in today's video, and that is hard drive space. I want to make a panel for the amount of space being used and the amount of space that's available in the RAID for my OpenMediaVault NAS. So I'm going to do that, and then we'll wrap up the video and have some final thoughts.

Once again, over here on the Prometheus dashboard, we have the Netdata disk space gigabyte average. What's really cool about Netdata and Prometheus is that they allow you to actually pull in data about the file shares that you make. This is the file share right here for my RAID, which I have across eight drives, all of which are four terabytes. I'm going to copy this and paste it into our dashboard. You know the drill: we're going to add an empty panel and paste this information in here, just like so. We have to change our Standard options unit to gigabytes, I believe. That is going to be in Data, so we'll scroll down here to Data, and then we will choose gigabytes, which is right here. It looks like we have around 16.9 terabytes of available space. Again, I'll change this to a numerical value using the Stat, and I don't want that to be red, so I will get rid of the threshold, and I'm going to change this to a nice... what? No, I'll go ahead and use blue. So, 16.9 terabytes of available space. One last thing before we apply this: we'll change the title of the panel to Media RAID Disk Available, because this is a RAID that I am trying to track here. If
you're just doing a regular hard drive, you can name it whatever you want here, but this is for available space. So let's apply this, and there's our disk space. Let's shorten this, and now we can do one more. We can duplicate that, edit it, and simply change "available" to "used" in the query, and now it is 12.8 terabytes of used space. Then up here in the title, again, we'll change "Available" to "Used", just like so, and we will apply it. So now we know what our available space is and what our used space is.

I've really enjoyed using Grafana over the last couple of months. I have used it in the past, but never so much as I have in the last couple of months, and learning how to use it has been such a treat: being able to see everything I like to monitor in such a beautiful dashboard like this. I hope you guys enjoyed today's video. If you did, be sure to drop a thumbs up, and if you're not subscribed, consider subscribing. Make sure you click the bell icon so you know when videos drop. Also, make sure you check out the homelab.wiki; that's where all the documentation for today's video is going to be located. That's going to be it for today. If you have any questions, be sure to drop them in the comments below. Bye for now. [Music] [Applause] [Music]
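The prometheus.yml edit described in the captions (one scrape section per Netdata instance, each with its own job name and that host's IP and port) can be sketched as a small Python script that renders the file. This is only an illustration, not the file shown in the video: the metrics path `/api/v1/allmetrics`, the `format: [prometheus]` parameter, and port 19999 are Netdata's documented defaults rather than values read out in the captions, and the job names and the `192.0.2.x` addresses are hypothetical placeholders.

```python
from __future__ import annotations

# Assumptions (not stated in the captions): Netdata's documented Prometheus
# endpoint and its default listening port.
NETDATA_METRICS_PATH = "/api/v1/allmetrics"
NETDATA_DEFAULT_PORT = 19999


def render_scrape_job(job_name: str, host: str, port: int = NETDATA_DEFAULT_PORT) -> str:
    """Return the YAML text for one Netdata scrape section."""
    return (
        f"  - job_name: '{job_name}'\n"
        f"    metrics_path: '{NETDATA_METRICS_PATH}'\n"
        f"    params:\n"
        f"      format: [prometheus]\n"
        f"    honor_labels: true\n"
        f"    static_configs:\n"
        f"      - targets: ['{host}:{port}']\n"
    )


def render_prometheus_yml(jobs: list[tuple[str, str]], scrape_interval: str = "15s") -> str:
    """Render a minimal prometheus.yml for (job_name, host) pairs.

    Prometheus rejects a config in which two scrape sections share a job
    name, so duplicates are refused here before the file is written.
    """
    names = [name for name, _ in jobs]
    duplicates = sorted({name for name in names if names.count(name) > 1})
    if duplicates:
        raise ValueError(f"duplicate job_name(s): {duplicates}")
    header = f"global:\n  scrape_interval: {scrape_interval}\n\nscrape_configs:\n"
    return header + "".join(render_scrape_job(name, host) for name, host in jobs)


if __name__ == "__main__":
    # Hypothetical job name and documentation-range IP; replace with the
    # address where your own Netdata instance is listening.
    print(render_prometheus_yml([("netdata-omv", "192.0.2.10")]))
```

To monitor a second server, add another `(job_name, host)` pair with a different name and IP. As in the video, Prometheus has to be restarted after the file changes before the new metrics appear.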
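The data source step (paste the Prometheus address into Grafana without the trailing slash) and the uptime panel (a value in seconds shown as hours) can also be checked outside Grafana. The sketch below builds an instant-query URL for the Prometheus HTTP API and converts the returned sample. The metric name `netdata_system_uptime_seconds_average` and the address `192.0.2.20:9090` are assumptions for illustration; copy the real metric name from your own Prometheus metric explorer.

```python
from __future__ import annotations

from urllib.parse import urlencode

# Assumed metric name; the captions only say the uptime metric is in seconds.
UPTIME_METRIC = "netdata_system_uptime_seconds_average"


def normalize_base_url(url: str) -> str:
    """Drop trailing slashes, as done when pasting the URL into Grafana."""
    return url.rstrip("/")


def instant_query_url(base_url: str, promql: str) -> str:
    """Build a Prometheus instant-query URL (/api/v1/query) for a PromQL expression."""
    return f"{normalize_base_url(base_url)}/api/v1/query?{urlencode({'query': promql})}"


def first_sample_value(payload: dict) -> float:
    """Extract the first sample from an instant-query JSON response.

    Prometheus returns each sample as [timestamp, "value-as-string"].
    """
    if payload.get("status") != "success":
        raise ValueError("query did not succeed")
    results = payload["data"]["result"]
    if not results:
        raise ValueError("query returned no series")
    return float(results[0]["value"][1])


def seconds_to_hours(seconds: float) -> float:
    """Convert an uptime in seconds to hours, rounded to two decimals."""
    return round(seconds / 3600, 2)


if __name__ == "__main__":
    # Hypothetical Prometheus address; 9090 is the Prometheus default port.
    print(instant_query_url("http://192.0.2.20:9090/", UPTIME_METRIC))
    print(seconds_to_hours(12672))
```

Fetching the URL with any HTTP client and passing the decoded JSON to `first_sample_value` gives the same number Grafana formats once the panel's unit is set to seconds.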
Info
Channel: Noted
Views: 29,112
Keywords: grafana dashboard, grafana, prometheus, netdata, homelab, self hosting, system monitor, server monitor, open source, geekedtv, grafana tutorial
Id: uimGcQVRaqI
Length: 21min 42sec (1302 seconds)
Published: Tue Sep 21 2021