Setup alerts in Grafana 10 with example

Video Statistics and Information

Captions
Hi friends. In this video we're going to talk about how to create alerts in Grafana. If you don't know what Grafana is, Grafana is an awesome visualization, analytics, and alerting platform where you can build interactive dashboards very easily. I have made a playlist on Grafana and I will leave the link of the playlist in the description.

So in this video we're going to talk about how to use Grafana for real-time data monitoring and creating alerts. You can see I created a blog post on this topic, and I will leave the link of this blog post in the description. I'm going to explain alerting in Grafana using a PostgreSQL data source. In our previous video we talked about how to use Grafana to connect to PostgreSQL for time series visualization. I will leave the link of this video in the description; you can watch that to understand how to connect PostgreSQL with Grafana.

So the thing we are going to achieve in this video is: we will visualize PostgreSQL data in Grafana and create alerts based on conditions. In our example we are plotting the temperature from sensor data present in the PostgreSQL database, and we are going to raise an alert if the temperature rises above a specified threshold value. Suppose the temperature is greater than 30°: raise an alert. If the temperature goes below 30° again, then the alert will be removed. Again, if the temperature goes above 30°, then the alert will be raised. So we are going to achieve this setup in this video. When an alert is raised or an alert is removed, you will get a notification, and we are going to demonstrate how to send alerts from Gmail: Grafana will use Gmail to send alerts to the specified stakeholders. So this is what we are going to achieve in this video. Let's get started.

Before using Grafana to configure alerts, let's try to get to know some concepts, because without knowing those concepts you can't use Grafana to configure alerts. We are trying to achieve alerting in Grafana, and to understand Grafana alerts you need to
know about these concepts: alert rules, alert states, contact points, and notification policies. So let's go through these concepts one by one.

So what is an alert rule? An alert rule is basically what it literally means: it's the rule which defines the alert. For example, you can write an alert to check the data continuously each minute and raise an alert if the temperature is greater than 30°, and only if this condition is met continuously for 5 minutes do you raise an alert. So this is an example of an alerting condition. In an alert rule you define which data source it should fetch the data from, what the alerting condition is, what the periodicity of checking the alert is, and how long the condition should persist to raise an alert. These are all defined in the alert rule.

The next thing is the alert states. There are mainly three alert states, which are OK, pending, and alerting. Let's take our temperature example. If the temperature is less than 30°, then the state is OK, because no alerting condition is met. If the temperature is greater than 30° for 2 minutes, the state would be pending, because for us to raise an alert the temperature should be greater than the threshold for 5 minutes, right? So the alerting condition is met, but the desired duration is not yet completed, so the state would be pending. If the temperature is greater than the threshold for more than 5 minutes, then the state would be alerting. So this way the alert rule can have three alert states, which are OK, pending, and alerting. Now if the temperature falls back from above the threshold to a normal value, then again the state will be OK. So these are the three alert states. And obviously, if the data source is not responding or if there is no data in the data source, then the state would be no data, and if there is some error while evaluating the alert rule, the alert state would be error. Mostly you would be dealing with these three states: OK,
pending, and alerting; and if there are some data issues, then you may get into these states, which are no data and error. So these are the alert states.

The next thing is the contact point. A contact point is basically a notification medium using which you can send your alert information to your stakeholders. In Grafana you can have multiple types of contact points, like email, webhooks, Slack, Discord, etc. So basically the change in alert states can be notified using these contact points. In Grafana a contact point can be one or more integrations; that means you can define a single contact point and it can have both mail and webhook integrations in one contact point. So basically a contact point is a group of one or more integrations like email, webhook, etc.

The next thing is a notification policy. So you defined an alert rule and you defined a contact point; using the notification policy you specify which contact points would be used by the alert rule. This is achieved by attaching labels to the alert rule.

So these were the concepts which are required to understand alerts in Grafana: alert rules, alert states, contact points, and notification policies. All right, enough of theory; let's try to do the real demo in Grafana.

My Grafana is already up and running on my localhost, so let's try to open Grafana. My Grafana is running at localhost:3000 and I'm presented with the login screen. Let's try to log in to Grafana. All right, now I'm on the Grafana home screen. Let's try to see my data sources. In Connections I'm seeing my data sources, and I've already connected a PostgreSQL data source to my Grafana. If you don't know how to connect a PostgreSQL data source to Grafana, I already made a video on that and I will leave the link of that video in the description; please go through that.

All right, the next step is to create a dashboard to show the data from the data source, right? But since there is no data in the data source, I will just populate
some dummy data for this demo. In my previous video's blog post, where I connected PostgreSQL to Grafana as a data source, I have given a sample script for populating some dummy data into the database. I have also given you the schema, so you can run the schema to create the dummy table and you can run this script to create dummy data in that table. So let's try to copy the script. I'm using DBeaver to connect to my database. DBeaver is an awesome tool using which you can connect to multiple types of databases and do a lot of database administration tasks. If you don't know about DBeaver, I already made a video on that and I will leave the link of that video in the description. Currently I've connected DBeaver to my PostgreSQL database, and this is the script which I'm going to run. This script is going to create some random data in the database, for the past 1 hour to the next 4 hours. So let's try to run this script. All right, the script is run and the data is populated in the database.

Since I've populated some dummy data in my PostgreSQL database, let's try to visualize it here. I'm going to create a dashboard by going to Dashboards, new dashboard, and add visualization, and I'm going to select the data source as the PostgreSQL database which I configured already. Let's try to move this up so that we can get some space to see the code. Here, instead of the builder mode, I'm going to go to the code mode and write some SQL to fetch the data. So this is going to be my SQL: select data time as time and temperature as sensor one temperature, where sensor ID is equal to S1, and I'm applying the time filter. I've already explained these terms in my previous video, so if you don't know how to use a PostgreSQL query in Grafana, please refer to that video. For now we are plotting sensor one temperature in this Grafana dashboard. So let's try to run this query, and yeah, you got your data here. Since we
have inserted the data for only the last 1 hour, this would be empty here, so let's try to change this from last 6 hours to last 1 hour, and you got your data here. All right, we are able to successfully create a dashboard showing the temperature from our PostgreSQL database. Let's try to save this. Let's name the dashboard sample dashboard, save, and I got my dashboard here.

Now the next thing is to create an alert, right? Creating alerts in the newer version of Grafana has changed a lot, so let's try to see how it's done. In the left pane go to Alerting, and here you have Alert rules. I've already created an alert rule, so let's try to delete that. Now I don't have any alert rules; let's try to create a new alert rule. Let's give a name to this alert rule; I'll just name it sensor one alert.

All right, now we need to define the query. Again let's go to the code mode and define the query. Here in the options you define how much data is to be fetched. I'm getting the last 10 minutes of data, which is OK for me, because for alert evaluation the last 10 minutes of data is more than enough. If you want to change the time range of fetching the data you can just change it here, but I'm OK with now minus 10 minutes, and apply time range. That's it. Now again let's paste our query to fetch the data. Let's try to run the query, and here is the data which we are fetching, the last 10 minutes of data. In this way you can check whether your query is right or wrong. So we have successfully defined the query to fetch the data for our alerting condition and we have checked it here.

All right, we have given the alert a name and we have defined the query to fetch the data for the alert. Now let's try to create an expression to do alert evaluation. My alert evaluation rule would be: create an alert if the temperature is greater than 30°. Just scroll down; they have already given you some easy expressions. The first
expression is a reduce expression and the second expression is a threshold expression. With the reduce expression, just as the name implies, you can reduce the whole series data into one number. The alias of my input query is A; it can be seen right here. The input to this reduce expression is A, and it is taking the last value, but I would like to take the maximum value. That means I'm going to take the maximum value in the last 10 minutes, and the value here would be 27°; you can see the preview value also. Let's try to preview again: the maximum value in the last 10 minutes was 29°.

For alerts to work, the output should be zero or one, and that can be achieved using the threshold expression. The input would be this reduce expression, which is B, so the input would be B, and if it is above 30° the threshold is satisfied, so "is above" would be set to 30°. So we have defined that if B is greater than 30°, then fire the alert. Let's try to preview again: now it's normal. Previously the threshold was zero, and obviously the data was above zero, so in the preview it was firing; the alert was in the alerting state. Now that I have made the threshold 30° and previewed, the state is normal.

So first we defined our data fetch query, and then we defined two expressions: the reduce expression gets a single value from the whole data fetched, the maximum of the data, and then I'm applying a threshold to see whether that value has crossed the threshold. Just with two expressions I was able to define my alerting condition.

In fact, people who are using some previous versions of Grafana, you can create a classic condition also. Let's try to delete these two expressions and create a new expression: add expression, classic condition. Here the condition would be a single condition: when the maximum of A is above 30°, the alert should be fired. So set this as the alert condition and now preview. Now it's firing; that means the data would have been greater
than 30°, and you can see it here: the data was 30.9, so that's why it's firing. So you can create a simple classic condition or you can create reduce and threshold expressions.

In fact, let's try to see some more of the possible expressions. I'll just add a new expression and create a math expression. This is useful if you want to create transformations. Suppose you have two queries and you want to add them: if you have a query A and a query B, you can add them using a math expression, something like that. That would be for complex scenarios where you want to do transformations on your queries; that means you want to double your values, or subtract two queries, or add two queries. For those complex calculations you can use the math expression. But our example is a simple threshold-matching alerting condition, so I'm just using a classic condition and saying that if the maximum of the data fetched is above 30°, fire an alert. Let's try to preview this again. It's still firing because the data is above 30°: 30.3. If the data comes below this threshold, then it would be normal.

All right, we have defined our alert condition, and the next thing is the evaluation behavior. This defines the periodicity of evaluating the alerts; that means we may define that the alerts are checked every 1 minute or every 5 minutes. To set an evaluation behavior you need to have a folder. If you want a new folder, you can just click on new folder, enter the name demo folder, and click create, and now you're using your new folder. To define the periodicity of alert evaluation you need to have an evaluation group, where you define how often the alerts are checked. So you need to select an evaluation group. If you don't have any evaluation group, just create a new evaluation group by clicking this button, and here enter a name. Here I'm going to write something like
temperature evaluation group, and I want to check every 1 minute, so I'm going to make the interval 1m. So every 1 minute the alerting condition would be checked. Create this evaluation group. Now all the rules in the selected group would be evaluated every 1 minute.

The next thing is the pending period. The pending period is basically the amount of time for which the condition should be continuously violated. That means you can define an alert such that only if the temperature is continuously in violation for 5 minutes at a stretch is the alert fired. If you want to create an alerting condition like that, you can use the pending period; otherwise you can just make it zero, which means that whenever the condition is met the alert is fired immediately. If you set the pending period to something like 10 minutes, that means only if the data is continuously in violation for 10 minutes will the alert be fired. For our example, to make it simple, let's make the pending period zero; that means as soon as the condition is met the alert will be fired. You can leave the rest as default: if there is no data, the state would be no data, and if there is an execution error or some timeout, you can make the state error. So it's OK to keep these as the default options.

The next step in defining the alert rule is adding the annotations. It's basically a summary of the alert; you can define it if you want. Let's try to describe our alert here. All right, we have described our alert here.

The next important thing is: how can you convey your alerts to your notification channels, or how can you send an email for your alert? That's possible by giving labels to your alerts. So let's try to give some labels to our alert. A label is basically a key-value pair. Let's give a label team equal to infra. This is the convention generally used: you give tags to your alerts, and based upon the tags you can route your alerts to your notification channels. Suppose
there is an organization, and you kept the sensor in the server room, and the concerned team is the infrastructure maintenance team; then I would keep the label team equal to infra. That is the general convention, or else you can keep any label you want; you can just write some key-value pairs here. And that's it, now you have defined the rule. Just save the rule and exit, and we have saved our rule. So we have defined the rule here, and you can see the rule is already normal here, and the next evaluation is within 1 minute; that means every 1 minute the rule is going to be evaluated.

Now if you want to pause this rule, how can you do that? It's really simple, actually. Go to the edit button and you can see the editing screen of the rule, and in the alert evaluation behavior you can pause evaluation. Now save the rule. Now the rule won't be evaluated; that means the rule checking is paused. So if you go to the alert rules and expand this sensor one alert, this rule is paused now. Unless you unpause the rule, the rule will not be checked.

So we have successfully defined an alert rule. Now the next thing is the contact point. I already created a contact point; let's try to delete that and create it again. OK, we are being told that the contact point is already being used in some policy. Let's go to my notification policies, and I want to delete this policy as well. Let's delete this notification policy, and then go and delete this contact point. All right, now let's try to create a fresh contact point.

Our contact point is going to be an email, so let's add a contact point. Let's name it infra team email, and the integration would be an email integration. Let me send the email to my infra team; suppose the email is infra@abcd.com, so this is the email of my infra team. In the optional email settings you can send a single email to all the email addresses. Suppose I have
two email addresses like infra@abcd.com and infra2@abcd.com: a single message would be sent to these two emails at once. You can also give an optional message. And that's it; it's that simple to create a contact point. If you want to add one more integration in the same contact point, you can do that: go here and add a contact point integration. Here you can add one more email, or you can add something like Discord or whatever. For this example let's keep it simple and create just a single email address as one contact point. Now let's save the contact point. Now let's go to edit this contact point, and you can see all the contact point details here.

So we are using an email contact point, right? Grafana should be able to send an email to this contact point, so for that you need to configure Grafana to use email. In our example we are using Google's email in Grafana to send emails. So let's try to see how we can configure email in Grafana. Grafana is installed on my computer, so we have a folder called C:\Program Files\GrafanaLabs\grafana\conf, and in that folder you have defaults.ini. Let's open this defaults.ini.
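
As a rough sketch of what the smtp section of that file ends up containing for Gmail, the Python snippet below renders such a section with the standard configparser module. The key names (enabled, host, user, password, skip_verify, from_address, from_name) are the ones in Grafana's smtp section, and 587 is Gmail's usual SMTP submission port. The helper name build_smtp_section, the sender address, and the app password are made-up placeholders, not values from the video.

```python
import configparser
import io


def build_smtp_section(gmail_address: str, app_password: str,
                       from_name: str = "Grafana") -> str:
    """Render an [smtp] section for Grafana using Gmail (illustrative helper)."""
    # interpolation=None so a '%' in a password is written literally
    config = configparser.ConfigParser(interpolation=None)
    config["smtp"] = {
        "enabled": "true",
        "host": "smtp.gmail.com:587",
        "user": gmail_address,
        # Google shows app passwords in groups separated by spaces; strip them
        "password": app_password.replace(" ", ""),
        "skip_verify": "false",
        "from_address": gmail_address,
        "from_name": from_name,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder credentials, for illustration only
    print(build_smtp_section("sender@example.com", "abcd efgh ijkl mnop"))
```

The printed text can be compared against the smtp section in the file by eye; Grafana has to be restarted before it picks up changes to its configuration files.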
Let's right click and open with; you can use any notepad, but I'm using VS Code because I will get easy syntax highlighting. Here I'm going to search with Ctrl+F for SMTP, and there's a section called smtp; this data would be used by Grafana to send emails. If you're using Gmail to send emails from Grafana, the settings should be something like this. First make enabled equal to true. The host would be smtp.gmail.com, or any SMTP host you are using; suppose you have your organization's Outlook email, you can use those settings. But for this example I'm using the Gmail SMTP settings, and this would be the same for all Gmail accounts: smtp.gmail.com:587. Here the user would be the Gmail address from which you are sending the email, and the password would be the app password. We will see how to create the app password in a while. You can make skip verify equal to false, and the from address would be the email address which you're using. For the from name I'm using Grafana here, but you can use your company's name, like Acme IT department or something like that.

So let's try to see how we can get this password. Log in to your Google account and go to accounts.google.com or myaccount.google.com, or just click on this and click on your account here. Then go to the Security tab, and here in the search bar just search for app passwords, and here you get App passwords. Go here, and I already have some app passwords. If you want to create an app password, let's try to do it now: I'll create a second app password named grafana 2, click create, and now I got an app password. Let's copy this, copy and done. Now I have an app password which I can use in my Grafana settings. So I'll go to VS Code and paste my app password here. Obviously there should not be spaces, so remove the spaces, and that's it. Now I have configured Grafana to use my Gmail account to send emails. Let's close the defaults.ini, let's close VS Code also, and let's close our Gmail.

Now since I've got my SMTP settings right, I can test my SMTP. I have created an integration, right? Let's try to test this: test, and send test notification. You can see the test alert is sent, and in my sent mail I can see my test notification. So that's how you can configure Grafana to send emails, and how you can configure a contact point to send email to a particular email address.

So till now we have configured an alert rule and we have configured a contact point, and now you need to connect this alert rule with the contact point, right? For that you have notification policies. By default there is the default policy, which will use email, but I want to create a new policy. So click on new nested policy, and remember the label which we used in our alert rule; let's use that here. I can write team equal to infra; that was the label I used in my alert. If I use this, this policy would be matched with that alert rule, and that alert rule will use this policy to send notifications. The contact point which will be used by this policy is obviously the infra team email. And that's it, save policy. Now see this here: if there is a label called team equal to
infra in your alert rule, the infra team email contact point would be used. So this way, using a notification policy, I was able to connect my alert rule with a contact point. That's it: very easily we have created an alert rule and connected a contact point with the alert rule to notify the changes in alert state.

But hey, I did not link my alert with the dashboard so that I can overlay the alerts of my alert rules on the dashboard, right? For example, let's go to Dashboards and my sample dashboard. I'm not able to see any alert state here, right? So how can I link my alert rule with this dashboard panel? Let's go to Alerting, Alert rules, and in the demo folder this was my alert rule, right? Let's edit this alert rule. Here there is a provision to link your alert rule with the dashboard: in the section add annotations, which is the fourth section, you can see link dashboard and panel. Let's click on this and select the dashboard. Our dashboard was sample dashboard, and there is one panel and that panel is shown here. Click on this panel and confirm. Now the dashboard panel is linked with this alert rule. Save this rule and exit.

Now let's go to Dashboards and our sample dashboard. Now you can see the alert state being shown here. Let's try to expand this, and let's make the alert threshold very low so that we can actually see the alert here. Let's save this dashboard again, and now let's change our alert rule. Let's go to Alerting and Alert rules, and make the threshold very low so that the alert will be triggered. Click on edit. In fact, let's unpause our alert so that evaluation will go on: pause evaluation will be removed. We are evaluating every 1 minute, right? But for this demo let's evaluate every 10 seconds. Click on this edit, and here instead of 1 minute, 10 seconds, save. Then in the alert rule, in the threshold, let's make it above something like 15° so that
it will fire definitely. Preview this: it's firing. Let's save the rule, and now let's go to Dashboards, sample dashboard. Within 10 seconds you should see an alert here, because the threshold is just 15°. Let's try to reload. Now you got an alert saying that the value has crossed the threshold, and if you hover over this you can see sensor one alert and the value is 30.93. So this way you can overlay your alerts on your existing panels.

Now let's make the threshold higher again so that the alert comes back to the normal state. Again let's go to Alerting, Alert rules, and go to our alert rule. Edit this; sorry, that's the evaluation group. Expand this, go to this alert, and edit. Here let's make the condition above 30° and preview. It's still firing because the value is greater than 30. Let's make it 35 and preview this. Now it's normal. Save the rule and exit. Now let's go to Dashboards, sample dashboard, and wait for 10 seconds. Refresh this again, and now you are in the OK state. So this way, as the alert rule evaluation goes on, if there is some violation of the threshold, or if the value is again less than the threshold, the alert state changes, and for each alert state change you should get a notification. Let's go to our Gmail, and if you see the sent mail, you can see there's a firing alert, which is notifying that the alert has fired. So this way you can send notifications when there are alert violations. The temperature is still not greater than 35°, so the alert is still in the OK state. If the temperature crosses 35°, then again there will be an alerting state.

All right, let's go to our alert rules and change the alert evaluation to every 1 minute instead of 10 seconds. So let's go to the alerts, edit it, and make the evaluation behavior 1 minute instead of 10 seconds. Click on this edit button, and instead of 10 seconds make it something like 10 minutes, so every 10 minutes the
temperature will be evaluated.

Now if you want to see the state history, that is, at what times the alert was fired, let's go to the alert rules, and if you expand this alert you can see something like a state history. Click on show state history, and this is the state history. Now if the alert goes from normal to alerting, you will see a new entry here. So this way you can see at what time the threshold was violated and at what time the value has come below the threshold again. You can see the transitions of the alert states using the state history.

So that's it, guys. Let's go to Dashboards, sample dashboard, and this is how you can see your alerts in Grafana. This way you can configure multiple alerts in Grafana and get multiple notifications; you can group your alerts and do a lot of convenient alerting. This was a simple example of how to create and manage alerts in Grafana. You can see I've created a blog post on creating and managing alerts in Grafana. I've given you the notes and also the configuration screens so that you can easily configure alerts in Grafana for your own use cases. I've also given you the link to the official demo video; you can go through it if you want to know about alerting in Grafana in more depth. In our video we have just covered how to very easily set up alerts in Grafana. So please be sure to check out the link of this blog post in the description of this video. Please ask questions or post your valuable feedback in the comment section. Hope you liked this video, guys. Thank you for watching. Peace.
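
The OK, pending, and alerting states described near the start of the video can be sketched as a small state machine. This is a simplified illustration and not Grafana's actual implementation: the class name AlertRule, its fields, the simulate helper, and the sample readings are all assumptions made for the example, and the no data and error states are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AlertRule:
    """Toy model of one alert rule; names are illustrative, not Grafana's API."""
    threshold: float                      # condition: value is above this
    pending_seconds: int                  # how long the breach must last
    state: str = "OK"
    breach_started: Optional[int] = None  # time the current breach began

    def evaluate(self, now: int, value: float) -> str:
        """Run one evaluation at time `now` (seconds) and return the new state."""
        if value <= self.threshold:
            # Condition not met: reset and go back to OK
            self.breach_started = None
            self.state = "OK"
        else:
            if self.breach_started is None:
                self.breach_started = now
            breached_for = now - self.breach_started
            if breached_for >= self.pending_seconds:
                self.state = "Alerting"
            else:
                self.state = "Pending"
        return self.state


def simulate(rule: AlertRule, readings: List[float],
             interval_seconds: int) -> List[str]:
    """Evaluate the rule once per interval over a list of readings."""
    return [rule.evaluate(i * interval_seconds, value)
            for i, value in enumerate(readings)]


if __name__ == "__main__":
    readings = [28, 31, 32, 33, 34, 35, 36, 29]  # one reading per minute
    # 5-minute pending period, as in the explanation
    print(simulate(AlertRule(threshold=30, pending_seconds=300), readings, 60))
    # Pending period of zero, as used in the demo: fires immediately
    print(simulate(AlertRule(threshold=30, pending_seconds=0), readings, 60))
```

With a 5-minute pending period the rule stays pending until the breach has lasted 5 minutes; with a pending period of zero it goes straight from OK to alerting, which matches the behavior chosen in the demo.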
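
The reduce and threshold expressions from the alert rule can be mimicked in a few lines of Python. This is only a sketch of the idea (collapse the fetched series to one number, then turn that number into 0 or 1); the function names reduce_series and is_above and the sample window are hypothetical, though the readings reuse values mentioned in the video.

```python
from typing import Callable, Dict, Sequence

# A few ways to collapse a series into a single number
REDUCERS: Dict[str, Callable[[Sequence[float]], float]] = {
    "last": lambda values: values[-1],
    "max": max,
    "min": min,
    "mean": lambda values: sum(values) / len(values),
}


def reduce_series(values: Sequence[float], how: str = "max") -> float:
    """Reduce step: turn the whole window of readings into one value."""
    if not values:
        raise ValueError("no data in the evaluation window")
    return REDUCERS[how](values)


def is_above(value: float, limit: float) -> int:
    """Threshold step: 1 means the condition is met (firing), 0 means normal."""
    return 1 if value > limit else 0


if __name__ == "__main__":
    window = [27.0, 29.0, 30.9, 30.3]  # last 10 minutes of sample readings
    peak = reduce_series(window, "max")
    print(peak, is_above(peak, 30))  # condition against a 30° threshold
    print(peak, is_above(peak, 35))  # same data against a 35° threshold
```

Choosing the maximum instead of the last value means a single spike anywhere in the window is enough to meet the condition, which is the choice made in the video.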
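
The way a notification policy connects an alert rule to a contact point through labels can be sketched as a simple lookup. This is a deliberately reduced model, assuming exact-match labels only; real Grafana policies support more than that. The names route, NESTED_POLICIES, and the "default email" contact point are hypothetical, while the team=infra label and the infra team email contact point come from the demo.

```python
from typing import Dict, List, Tuple

# Each nested policy: (label matchers, contact point name)
Policy = Tuple[Dict[str, str], str]

DEFAULT_CONTACT_POINT = "default email"  # hypothetical name for the default policy
NESTED_POLICIES: List[Policy] = [
    ({"team": "infra"}, "infra team email"),
]


def route(alert_labels: Dict[str, str],
          policies: List[Policy] = NESTED_POLICIES,
          default: str = DEFAULT_CONTACT_POINT) -> str:
    """Return the contact point of the first policy whose matchers all match."""
    for matchers, contact_point in policies:
        if all(alert_labels.get(key) == value for key, value in matchers.items()):
            return contact_point
    # No nested policy matched: fall back to the default policy
    return default


if __name__ == "__main__":
    print(route({"alertname": "sensor one alert", "team": "infra"}))
    print(route({"alertname": "some other alert", "team": "web"}))
```

An alert without the team=infra label falls through to the default policy, which is why the label has to be set on the alert rule for the infra team email to be used.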
Info
Channel: Learning Software
Views: 16,003
Keywords: taming_python, learning_software, grafana, postgresql, alerts
Id: nW5AuEtSqVc
Length: 27min 33sec (1653 seconds)
Published: Sat Jan 06 2024