Guide to Grafana 101: Advanced Topics & Pro Tips

Thanks, everyone, for joining us. My name is Avthar, I'm a developer advocate here at Timescale, and you are here for today's session on advanced tips and tricks for time series in Grafana. This is the fourth and final session in our series, Guide to Grafana 101. Some of you may have been here from the beginning: session one focused on how to build awesome visuals in Grafana, session two was about getting started with alerting, and session three covered variables and templating. Today is the final session, where we're going to tackle some of the advanced practices of dealing with time-series data in Grafana.

This is an interactive session, so please feel free to ask questions at any time using the Q&A feature in Zoom. You'll see there are two buttons, one for Q&A and one for chat. If you just want to let us know what's on your mind or respond to things, you can use the chat, but for questions please use the Q&A feature. You can ask questions at any time, and our technical team of dashboarding experts is on hand to answer them at the end of today's session. One other reminder: you will receive a recording of the session afterwards, with code and links to resources, so if you want to share that with your teammates or go back and run through the session yourself, that'll be there for you.

OK, before we get into the roadmap for today, let me do a quick introduction of myself, just so that we're not strangers. My name is Avthar. I'm a developer advocate at Timescale, originally from South Africa. One of the things I really like to do is use technology to empower people, so I'm glad I get to do that every day in my job as a developer advocate. I also write about the new things and new technologies I'm learning every day on my website, avthar.com, as well as on Twitter, so if you want to give me a follow there, you're more than
welcome to.

So, on to today's session, which is all about tips and tricks for dealing with time-series data in Grafana. The session is broken down into five parts. First, I'm going to do a quick overview of the demo that we're going to run through. The demo itself is split into three parts. Part one is about time shifting: how do we compare what's happening right now to previous intervals? That's the basic problem that time shifting helps us solve. The next part is automatically switching between different aggregations of our data. Say we have a one-minute, a one-hour, and a one-day aggregation; I'm going to show you how you can automatically switch between all three of those in the same query, so that you get more efficient drill-downs into your data. And then we're going to deal with the issue of templates and alerting. A lot of you may be familiar with Grafana's limitation of not having alerts on templated queries, and I'm going to show you some different ways to get around that. Lastly, we'll leave you with some resources so that you can take what you learned today and apply it in your work and your projects, and we'll have some time for Q&A to answer your questions. So if you have any questions at any time, please put them in the Q&A feature and we'll definitely get to them at the end of the session.

OK, now that we know where we're going, let's get started with the first part (sorry, that's the wrong thing), the demo overview. So what exactly are we going to look at today? As already mentioned, the theme for today is advanced tips and tricks for handling time-series data in Grafana. I'm going to show you a variety of how-tos as well as workarounds for situations that may have tripped you up in the past. The reason we wanted to do this session is that, once you've learned the basics of
visuals, alerting, and templating, there are some common issues that can be frustrating, and I'm going to show you how to overcome three of them in today's session.

One other thing I wanted to mention: if you need to brush up on the basics, unfortunately this is not the session for you. If you're looking for the basics of creating awesome visuals, alerting, and templating and variables, you can go to this link, where you'll find the previous three sessions we did on the basics of Grafana. This session is really for the more advanced and more nuanced problems that we're going to solve. If you're a beginner looking to get started with Grafana quickly, I'd recommend you check out those three previous sessions on visuals, alerting, and templating. So that's just some expectation setting for today's session.

Now, on to the scenarios that we're going to use. We always like to use real-world scenarios and real-world datasets when we illustrate anything during our Timescale technical sessions, so there are two scenarios that I'm going to use to show you the power of combining Timescale and Grafana. The first one is taxi ride monitoring. This is a classic IoT use case, where vehicles have devices on them and we can track the activity of the various taxi rides that are going on. I'm going to use this use case to show you both how to build time-shifted graphs and how to get efficient drill-downs on your data, where we automatically switch between aggregations of different granularities. So that's the IoT use case, using the taxi ride monitoring example. In scenario two, I'm going to use a more classic DevOps scenario, microservice monitoring, where we're going to learn how to set up alerting on dashboards that have variables and templates. So that's the overview of the scenarios for today: IoT and DevOps, something for
everyone, depending on what your use case might be.

So let's get into the first part of the demo, which is building time-shifted graphs. Again, just a reminder: if you have any questions, please hit the Q&A button and ask them, and I'll answer them either during the session or at the end.

OK, let's dig into time shifting. As I mentioned, in this part of the demo we're going to learn how to create time-shifted graphs, where we compare the value of a metric right now to its value in a previous time period, and this is going to be using Postgres and Grafana. The dataset that we're going to use for this is the New York City taxi rides dataset. This is data from January 2016, and it records taxi ridership over a period of time, in this case one month. The dataset also associates the time-series data with metadata. Some of you may be familiar with it, because it's the dataset we use in our Hello Timescale tutorial. If you want to play around with it more, you can check that out at the link on the screen: tsdb.co, forward slash, "timescale taxi". So that's the use case, just to give an understanding of the underlying data: we're looking at taxi rides over time in New York City over the month of January 2016.

Now, on to the problem of time shifting. The issue you might often run into in Grafana is that you want to compare the current value of a metric, the activity that's going on right now, with the value of that metric in a previous time period. For example, for something that happened today, you might want to know what the value was at this time one day ago, two days ago, or three days ago. You're basically comparing activity in period t with activity in previous periods t minus 1, t minus 2, and so on, and we're actually
going to learn how to build this exact graph, where the goal is to be able, at any point in time, to compare what's happening right now with what happened in the past, super easily. The two queries that I'm going to take you through today are, first, how to compare today's rides with the prior two days, again using our taxi ride example, and second, how to compare the rides taking place today to last week. So at any point today, I want to be able to see how the rides compare to the past two days as well as to last week. That's just some orientation for what we're going to see in a moment.

There are different ways to do this. The first way would just be to create separate graphs. You could even have a graph for week one and a graph for week two and juxtapose them. The issue with that is that it's difficult to compare side by side, because these are two different graphs and you constantly have to switch between them. It also requires maintenance and manual updating: it's not automatic, because you're not automatically setting the time periods. So manually creating separate graphs for each week is an approach that can work, but it's suboptimal.

The approach that I'm going to show you today uses TimescaleDB and Postgres as the data source, and we're going to use the JOIN LATERAL function that Postgres has to do time shifting really efficiently and effectively. This approach has a bunch of things going for it. The first is that it makes it very easy to compare trends, because we'll have the trend lines on the same graph, so at any point in time it's easy to see how the activity today compares to the previous days or previous time periods. It also auto-updates: you write this query once and it
means that, as long as you have a connection to your data source, you will always be able to see how you compare to previous weeks, so it actually saves you time in the long run. The only con is that it requires a somewhat tricky query, but that's not really a con, because I'm going to show you how to do that right now in Grafana.

OK, let's get into our Grafana dashboard. What I'm using right now is hosted Grafana on Timescale, and we have it connected to Timescale, which runs on Postgres, as the data source. What we're going to do now is take you through the problem: I'm going to show you the end state that we want to reach for each of these graphs, and then we're going to build it step by step.

So, the end state. Let me just zoom in on this one. This is a graph that shows the time shift for the past three days. If I zoom in on a period of time, let's say this period of two days, I can see from the different colors how the activity of the rides today compares to the activity of the rides at the same time one day, two days, and three days ago, and we have these nice colors that differentiate the different days I'm looking at. That's the end state that we want to reach, so let's see how we can get there.

The first thing we're going to do is create a new panel, and let me actually move this panel into this row that I have here. As I mentioned, the data source that I'm using here is Timescale, which is an extension on top of Postgres. For those of you who are new to Timescale, it's a time-series database built on top of Postgres; it's basically Postgres for time series. It combines the power of a relational database with the scale and performance that you usually associate with a non-relational, or NoSQL, database. So we're going to select this Postgres data
source, taxi DB, which is where all my taxi data is, and then I'm going to edit the query manually. What I'm going to use here is the Postgres lateral join function, and that's going to allow me to plot the graphs for these different time periods. I'm going to paste the query in here, and then we're going to run through it line by line. This can seem like a very complex query at first, but let's run through it.

The topmost SELECT statement is just naming the series as well as, at a high level, selecting the time and the ride count from the subquery where we're actually going to get the ride values for the previous days. What I'm going to do first is construct a graph that compares the rides that have taken place today to the rides that took place in the past three days. So the first thing we're going to do is select the time and the ride count, and then I define exactly what to name my various series. In this case I've said: when the step is zero, call it 'today'; otherwise, call it minus the interval value, cast as text. We call that column metric, and that basically tells Grafana what to put in the legend down here. As you can see, for minus interval we have minus 1 days, 2 days, 3 days. That's what that statement means.

Then the secret to this query is to use this JOIN LATERAL function. What does JOIN LATERAL do, for those of you who might not be familiar with it? Lateral joins are kind of like a for-each loop in Postgres: they make the results of the query before the lateral join available to the subquery after it. In this case, I have a subquery to generate the intervals, so I want intervals of zero days, one day, two days, and three days, and the lateral join makes the results of this subquery available to the
subquery after the JOIN LATERAL statement right here. So essentially what's happening is this. I have a subquery to generate the intervals: I'm selecting this thing called step, and I'm creating an interval from it, a step of days, using the Postgres generate_series function to say, OK, from 0 to 3 days, give me each of those possible values. The outcome of this is basically 0, 1, 2, 3. I can then access each value returned from this interval subquery in a second subquery that selects the rides for that particular period of time.

So I have another subquery, a SELECT statement, and I'm going to bucket the time that the rides took place into 15-minute intervals using time_bucket. For those of you not familiar with time_bucket, it's a special SQL function in Timescale that allows you to bucket your data by arbitrary intervals, and in this case I'm using 15-minute intervals. Usually you'd just specify the name of the time column, which in this case is called pickup_datetime, but the trick here is to add the value of the interval to it. The interval is something that we defined up here, so in this case it'll be 0 days, 1 day, 2 days, and so on, and the query is going to do this for each value of the interval, each of 0, 1, 2, 3. We cast that as a timestamp, and then we just count all the rides and call that ride_count, and we're doing that from a hypertable called rides.

Then, in our WHERE clause, we subtract the value of the interval from the time range to plot. Because we want all of these on the same graph, this is what actually allows us to get them there, such that even though I'm looking at a time
period for today, I can actually see the values from a day ago, two days ago, and three days ago, and that's because of what we're going to cover now. What I do here is say: where the pickup_datetime is between the time from and the time to. In this case we have the Grafana macro for time from, and I just subtract the interval from it, and again for time to, I subtract the interval from it. So when the interval is zero days, it's just the time range that's selected in the time picker at the top here. When the interval is one day, we subtract one day from both the from and the to; for two days, two days is subtracted from both the from and the to. Again, what that allows you to do is have all these lines on the same graph and make this comparison, such that at a point today you can see the values from two days and three days ago. Without this, you wouldn't be able to do direct comparisons like I'm able to here.

Lastly, we group by and order by 1, which in this case is the time. And finally, because we're doing a lateral join, we need to specify something to join on. I'm going to call this subquery l, and we join ON true. What that means is that wherever we are able to compare one, two, and three days back, those lines are shown. I'm going to show you an example, comparing to one week ago, where, because the data range we're dealing with is only one month, you might not be able to do comparisons because data doesn't exist for that time period. So ON true basically means: whenever we have data that can show us the past one, two, and three days, plot the lines for that period of time. And that's the query.
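Putting the walkthrough together, the three-day time-shift query might look roughly like the sketch below. The on-screen query is not captured in the captions, so this is a reconstruction under stated assumptions. The names `rides`, `pickup_datetime`, `ride_count`, `metric`, and `step` follow what is said aloud; the alias `g_offsets`, the `timestamptz` casts, and the column name `shift` (called "interval" in the talk, renamed here only to keep it distinct from the type name) are assumptions. `$__timeFrom()` and `$__timeTo()` are Grafana's macros for the dashboard time range, and `time_bucket` is the TimescaleDB function described above.

```sql
-- Sketch of a three-day time shift for a Grafana panel (PostgreSQL/TimescaleDB data source).
SELECT
  time,
  ride_count,
  -- Legend label: 'today' for the unshifted series, otherwise e.g. '-1 days'
  CASE WHEN step = 0 THEN 'today' ELSE (-shift)::text END AS metric
FROM
  -- Subquery that generates the offsets: 0, 1, 2 and 3 days
  (
    SELECT step, (step || ' day')::interval AS shift
    FROM generate_series(0, 3) AS g(step)
  ) AS g_offsets
JOIN LATERAL (
  -- Runs once per offset row, like a for-each loop
  SELECT
    -- Add the offset so older rides are drawn on today's time axis
    time_bucket('15m', pickup_datetime + shift)::timestamptz AS time,
    count(*) AS ride_count
  FROM rides
  -- Move the dashboard time range back by the same offset
  WHERE pickup_datetime BETWEEN $__timeFrom()::timestamptz - shift
                            AND $__timeTo()::timestamptz - shift
  GROUP BY 1
  ORDER BY 1
) AS l ON true
```

One point worth keeping in mind when reading the explanation of `ON true`: the condition itself filters nothing. A shifted series is missing from part of the graph simply because the lateral subquery returns no rows for a shifted time range that has no rides in it.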
Then I'm just going to edit this panel title to say 'three-day time shift', because that's what this is, and we're going to save and exit. You can see that what we've done is replicate this graph that shows the time shift for the past three days, such that at any point we can compare how the rides are doing now to how the rides did three days ago. Again, this is super useful, especially if you're doing analytics and want to know at a glance how you're doing compared to previous days, weeks, months, or even years. It lets you see that straight away, and it lets you separate upticks that are seasonal, where at this time of year you usually expect an increase, from other variables you might not have considered. What I'm going to do here is just get rid of the fill, so that we just have the lines and it's a bit easier to read.

OK, so that's the first example: we compared today's rides to the prior two days. Now I'm going to compare the rides from today to last week, and I'll show you how to modify the query that we have in order to do that. Essentially, what we want is to create this graph right here, which shows us the rides that have taken place today and compares them to seven days ago. In this case we also have 14 days ago, but I'm just going to do the one for seven days ago. As I mentioned, sometimes you might not have data to show. In the first week, because I only have data for one month, we can only see rides for today; after one week, I can then compare my rides for today to how the rides went one week ago. In this first week I don't have data for the previous month, which is why I don't see the comparison line. But if you do have that data, then you won't experience this lack of
data; you won't just have one line on the graph, you'll have all three lines in this case. OK, so this is the graph that we're going to create, which time-shifts by a week instead of a day, and I'm going to create another panel in order to create it. Apologies, my mouse is acting really weird. So once again, selecting taxi DB as the data source, good old Postgres and Timescale. Again, I'll copy and paste the query in for you, and then we can go through it line by line.

The first part of the query is the same: it's defining what the metric name is supposed to be, and it's the exact same line of code that we had in the previous query. The only change is that, instead of the interval being in days, it's now going to be in weeks. You can see here that I'm concatenating step with 'week', whereas previously we were concatenating it with 'day'. And because I only want to compare how we're doing this week to last week, I'm only going to generate a series between 0 and 1. Obviously, if I change this value to 2 and let it refresh, I'll then compare against one week ago and two weeks ago. So you can play around with the series that you generate to control how many previous intervals you want to compare to.

Then we use the same JOIN LATERAL trick. We define the intervals that we want to compare to, in this case one week ago, and then we do the same trick of adding that interval value to the time column, so we're looking at times from either today or one week ago. And because we want all of these to show up on the same graph, the pickup_datetime needs to be between the time from minus the interval and the time to minus the interval. That's so that we can use the Grafana date-time picker here at the top and get accurate results.
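A sketch of the one-week variant described here, again reconstructed from the spoken description rather than copied from the screen. The names `rides`, `pickup_datetime`, `ride_count`, `metric`, and `step` are the ones mentioned in the talk; the alias `g_offsets`, the `timestamptz` casts, and the column name `shift` (the talk calls it "interval") are assumptions. The complete query is shown, including the grouping and the `ON true` join, so that it can be read on its own; only the offsets subquery differs from a day-based shift.

```sql
-- Sketch of a one-week time shift for a Grafana panel (PostgreSQL/TimescaleDB data source).
SELECT
  time,
  ride_count,
  -- Legend label: 'today' for the unshifted series, otherwise e.g. '-7 days'
  CASE WHEN step = 0 THEN 'today' ELSE (-shift)::text END AS metric
FROM
  -- Offsets of 0 and 1 week; raise the upper bound to 2 to add "two weeks ago"
  (
    SELECT step, (step || ' week')::interval AS shift
    FROM generate_series(0, 1) AS g(step)
  ) AS g_offsets
JOIN LATERAL (
  SELECT
    -- Add the offset so last week's rides line up with today's time axis
    time_bucket('15m', pickup_datetime + shift)::timestamptz AS time,
    count(*) AS ride_count
  FROM rides
  -- Shift the dashboard time range back by the same offset
  WHERE pickup_datetime BETWEEN $__timeFrom()::timestamptz - shift
                            AND $__timeTo()::timestamptz - shift
  GROUP BY 1
  ORDER BY 1
) AS l ON true
```

With only one month of data loaded, the shifted series has no rows for the first seven days of the range, which is why a single line appears there.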
And then again, we group by and order by time, and we're joining ON true. As I mentioned, if we select a time period that is more than a week, we can see where this join ON true becomes important. Let's give Grafana a minute to refresh. I'm going to show you in a minute how to actually speed up Grafana so you don't have to wait too long for your queries to refresh. In the meantime, let me give this panel a title, 'one-week time shift', and then we are good to go.

OK, so as you can see over here, the reason why we join ON true is that in this part of the range we don't have values for seven days previous, because we only have one month's worth of data, whereas once we do have that, we get the values for the previous week and can compare them directly. As I mentioned, this time shifting is super valuable if you want to do comparative analysis really quickly, and it also lets us more readily judge whether our upticks or downticks are seasonal, or whether there are other variables that we might need to drill down into more deeply.

Just as a summary of this section: we saw how to use lateral joins to do time shifting, in order to enable this easy comparison of your live data. You can imagine the situation where this is actually a monitoring dashboard, and you're looking at how many rides have taken place today and, at this point in time, comparing that very easily to time periods in the past. That's the power of time shifting, and that's how you can do it with Postgres and Timescale.

So that's demo one, on time shifting. We're going to switch gears and go into demo two, on automatically switching aggregations. But before we do that, if you have any questions about the time shifting that we just saw, or want me to run through the query again, we can do that. One thing to mention is that I'll be putting the code for
the queries that I use on GitHub, and you'll get a link to the repo in the follow-up email, which will allow you to take that code, run with it, and modify it for your own use. So don't worry too much if you didn't understand every part of the syntax: you'll be able to play around with it yourself, or you can ask me a question and I'll explain a bit more.

OK, so the second part of today's session is about this idea of auto-switching aggregations. What is this all about? In this part of the technical session, we want to be able to efficiently drill down into the data with fine granularity, and we want to do that really quickly. One of the best things you can do to speed up your dashboards in Grafana, especially when using Timescale and Postgres, is to not query the raw data in the hypertable itself but to use aggregations, and those aggregations are going to be at different granularities: daily, hourly, minutely, depending on your needs. What we're going to show you today is how to automatically switch between these aggregations of different granularity, so that you can drill down most efficiently and get the granularity you want without consuming too much CPU.

Again, the basic solution would be to create graphs for each separate granularity, or to use the finest granularity for everything. Obviously this is easy, but the issue is that you can end up with a bunch of graphs that show the same thing at different granularity levels, so it's duplicated, and it's also CPU-intensive and can slow down Grafana. As we just saw, I was waiting there for about a minute for something to load, and no one likes to wait for the dashboard to reload. So this is something you can use to make sure that stuff like that doesn't happen. OK, so let's get into the solution that we're going to be talking
about, which is to use Timescale and Postgres to switch the aggregate being queried by using Postgres's UNION ALL feature, or function rather. As I mentioned, this is more CPU-efficient: you get the fine-grained data only when you need it, and when you want to look at things from a more zoomed-out perspective, you query the less fine-grained aggregates. Once again, the query can be kind of tricky at first, but I'm going to take you through it and show you exactly how to do it right now.

OK, let's get back into our Grafana dashboard. As I mentioned, you could have a situation like this one, where I have two panels set up that are both showing me the same thing: the rides taking place in January 2016. One panel is using a daily aggregate, so I'm querying this aggregate called rides_daily, which is a continuous aggregate that I've created in TimescaleDB. The other panel shows me the exact same thing but from an hourly aggregate, rides_hourly. The issue is that if there's something I want to drill into on a specific day and I zoom into that time period, I don't get the granularity I'm looking for if I'm just looking at the daily rides. I then have to switch over to my hourly rides, which gives me finer granularity, but I don't really want to have to do that switching. So let's see how we can avoid that.

Let me just reset the time periods. The ideal case is something like this: at a high level, the graph shows you the daily number of rides that have taken place, so it shows you the general pattern, but when you want to drill down into the data, you can just select an arbitrary time period and get more fine-grained data. So in this case I'm seeing the hourly data, and then if
I want to drill down even more, say to what's happening in this period here, I'm actually seeing minute-by-minute data. What it's doing is switching between the daily, hourly, and minutely aggregations, and we're going to see how to make that happen.

OK, let me once again change this time period, and we're going to create a new panel. Let's drag this panel down. Nope, sorry, my mouse seems to be wrestling with me today. So we drag this panel down and edit. Now that we have this panel in place, we select taxi DB once again as the data source, and now it's time for the query. In this query, as I mentioned, we're going to use the UNION ALL function in Postgres. Let me show you the query, and then I'll explain exactly how it works. OK, just selecting that query, and here it is.

At a high level, what we want to do is select values from all the different aggregates that we have. I have a daily and an hourly aggregate, and then I'm just going to use the raw data for the minute level. You could also have a minutely aggregate, but I didn't set that up in this case, in order to show you how to do it. Then, when I'm querying data over a specific interval length, I want to use a certain aggregation. The rules that I've set up are: use daily aggregates for intervals greater than 14 days, hourly aggregates for intervals between 3 and 14 days, and the raw data in minute intervals for intervals between 0 and 3 days. That's what we're trying to do at a high level, so let's look at the code that makes it possible.

The first thing we have is a basic SELECT statement that selects the time, the ride count, and the metric, which in this case says 'daily', from this rides_daily continuous aggregate
that we created. Then, in the WHERE clause, we say: if the time to minus the time from is greater than 14 days, then use this time filter on the day column. With this time filter, we basically need to give Grafana the time column to use, so that it knows where to find the time values to plot on the graph. In this case we're saying that when the interval is greater than 14 days, we want to use the day column, because we've selected day as the time here.

Normally we'd create three separate graphs for these things, one daily, one hourly, one minutely, but the UNION ALL function allows us, as the name suggests, to have the union of all these different queries. And because we've specified which time column to use where, through Grafana's time filter macro, it'll automatically switch which time column it's using based on the interval that we've selected. We can see here again, in the hourly aggregate, that the first SELECT statement is the same: we're selecting hour as the time column, the ride count, and 'hourly' as the metric, from the aggregate called rides_hourly. And here, where the time to minus the time from is between 3 and 14 days, the time filter column is hour, so hour is the column that Grafana uses for the time filter.

Then we have the last UNION ALL query. You can have an arbitrary number of them, and you can chain together all your different aggregation granularities. Here I'm going to use the raw data in minute intervals, for lengths between 0 and 3 days. I'm using a subquery because we're going to have a GROUP BY in here, and I don't want it to conflict with the overall ORDER BY time that we do at the end. Over here, because I haven't pre-aggregated the data into time buckets, I'm going to use
once again the TimescaleDB SQL function called time_bucket to aggregate things into 1-minute intervals, count the number of rides, and then select minute as the metric that's going to show up at the bottom here. And then we say: when the time to minus the time from is less than 3 days, we want to use this pickup datetime as the time filter's time. We're telling Grafana, hey, this is the column that you need to use for the time filter. We're going to group by 1, which is time, so we're grouping these results by time, and call this minute. And then lastly, order everything by 1. So what does that get you? Let me just turn on the points so that we can see more clearly on the graph, and then I'm going to call this switching aggregations. Let's save this and see if that works. What we can see here is a time period that's the whole month, so here we have values for every day of the month. Then if we zoom in to something that's less than two weeks, let's say that amount of time, we have values every hour, and notice how the metric has changed from daily to hourly. And then, let me actually put the query in view, if we have data that's more fine-grained than three days, let's say just one hour, then I can actually see minute-by-minute values, and again the metric that's being shown is minute. So it's as easy as chaining together several queries with UNION ALL in order to get this automatic switching between different aggregates, as long as you define the intervals for which to use the different aggregates in the WHERE clause. Over here we said: when it's greater than 14 days, use daily; when it's between 3 and 14 days, use hourly; and when it's less than 3 days, just use the minutely aggregates, which we've just computed on the fly from the raw data. Okay, so that's actually how to solve this problem of making Grafana faster when you're doing these kinds of drill
downs. At the end of the day, what we saw is... let me just reset this time range. What's tricky is that usually when you're using Grafana you're using it for live data, but because this particular data set is from 2016, I have to reset the time range all the time so that I can see at a high level what's going on. So what we saw in this second part of the session is how to use UNION ALL to automatically switch between the different aggregate views that we're querying, so that we can do more efficient drill downs and get the most detailed data at every level that we're looking at, and that actually makes Grafana much faster. Okay, so that brings us to the end of the second part of today's session.
Now we're going to switch to the third and final part of the demo, which has to do with templates and alerting. This part of the demo uses a DevOps data set that's monitoring microservices in Kubernetes. At a high level, what is happening in this data set is that I'm scraping metrics using Prometheus and storing those metrics in TimescaleDB for long-term storage and analysis. And what metrics am I scraping? I have all these different microservices that make up my shop. For those of you who want to learn more about how this setup actually works, you can check out our Guide to Grafana templating session, where we go more in depth about how I'm creating a template for a custom microservices setup that I have, where I want to create templates to look at the different metrics that I have in two different shops composed of different microservices. Okay, so that's the overview for this section. Basically, just know that I'm looking at metrics from Prometheus and they're getting stored in TimescaleDB, and the problem that we want to solve here, or rather that we want to look at how to work
around, is that we can't actually use alerts on templated queries in Grafana. I just put up a screenshot of a GitHub issue that shows people are still asking for alerting support for queries using template variables. This has been open since about 2016. This is the problem: if you've used the alerting functionality before, or if you do decide to use it, you'll run into this sooner or later. So what I want to do for you today is show you some ways to work around this limitation and explore the pros and cons of different approaches. We're going to look at three different approaches, to be exact.
Okay, so let's go into the first approach. The first thing that you can try in order to overcome the inability to have alerts and templates together in Grafana is to separate your alerting dashboards and alerting panels from your exploration panels and dashboards. You essentially create separate panels for each environment. Let's take a look at what that would look like. Let me just switch the Grafana environment that I'm in. What I have right here is a templated dashboard. I just have one metric that I'm looking at here, which is the max memory used by my Redis microservice, so this is the cache, and I have two different shops called demo1 and demo2. If I select demo1, it shows me the max value for demo1; if I select my other shop, demo2, it shows me the max value for demo2; and if I select All, it shows me the max value over both shops. So you can see that this dashboard is templated. If you want to see how to construct this query and how it works, you can check out the templating webinar that I did. But this is just to show you that this particular dashboard is actually templated, and the issue here, as I mentioned, is that because I have a template, I get this error that template variables are not supported in alert queries in Grafana. So I have
no option but to go back and try to work around this. Option one is to create separate panels for monitoring. So what I've done is, I have this panel for exploration, and in this row I have these two separate panels, one for the demo environment and one for the demo2 environment, that I'm then going to put alert queries on. Because these are just normal queries, I've had to look at each of the environments that I have and create a separate panel for it. I can then create an alert on each of these, and alerts then function as normal. If you want to get into the nitty-gritty of how alerts actually work in Grafana, check out our second session, on alerting in Grafana, where I take you through what all of this means. For the sake of time, I'm going to skip over that and assume that you already know how to use alerts, but if you're not sure, definitely go and check out that second webinar. Okay, now as you can see over here, I have these two different panels, one for each environment. So that's the first option that you have: separate out your monitoring panels from your exploratory panels.
Now this option has some pros and cons, as with all of the approaches that we're going to see in this part. The first pro is that it's a simple implementation, in the sense that it's conceptually simple to do. The other pro is that it's also easy to identify the values that are triggering the alert, because you have one panel for each value of the variable. In this case, for each host I have a separate panel, with an alert on each panel, so if an alert gets fired you know exactly the panel or the query that generated that alert. The con here, probably the biggest con, is just that it requires so much work. There's so
much duplicate work that you have to do, as well as duplicate maintenance: if you change something in your exploration dashboard, you're going to have to change it in your alerting dashboard as well. So that's the first one. Secondly, it can just get really tedious. In this case, as I mentioned, I only have two values that this variable can take, demo and demo2, and then I have All. What if you're dealing with a case where you actually have ten of these values, where you're monitoring ten different shops, or ten different hosts? You're going to have to create ten of these panels, and it's just going to be a lot of work. As I mentioned here, it can get tedious if you're dealing with, let's say, more than five variable values. So that's the drawback of that approach.
The second approach is something that one of the Grafana team members actually recommends: you can use alert rules and turn them into something dynamic by using regex and wildcard queries. It says here that alert rules can already be made dynamic with regex and wildcard queries. So the next part of this exploration is to show you how to use wildcards and regex in your queries. What I'm going to show you is a couple of approaches. The first one is just using a general wildcard, and for the second one we're going to revisit our old friend, the lateral join in Postgres, to show you how to use that to get around this limitation of not being able to alert on templated queries. Okay, so let's get back into our dashboard, close up this row, and open the second row, for wildcards. I still have the same query here. Let me go to this query quickly so you can see what's going on. This is a SQL query, so despite these being Prometheus metrics, they're actually
being stored in TimescaleDB, and I'm querying them using SQL. All I'm doing here is selecting the values in one-minute intervals, because I'm scraping every 10 seconds, and getting the maximum of this value, because I want to know the max memory used. In this case, because of the way that the data is set up, I'm querying a hypertable called Redis memory used bytes, since each of these metrics has its own table, and I'm doing a join on this, under a different schema, called Prometheus series, to get the metadata associated with this Redis memory used bytes metric. Part of the metadata is the Kubernetes namespace, which shows me the different environments: there's one namespace called demo and another called demo2. And in this case I just use the most basic wildcard, which is the percent character here, and that basically says: for any of the namespaces, get me the values. So what this does is essentially the same as using the All option here: it shows me the value over all the namespaces. The drawback, again, is that if I want to look at the values for each of the namespaces, that is, for each of the values of the variable, I can't do it, at least the way my query is structured right now.
One way to get around that, incorporating what we learned earlier about lateral joins, is to use a lateral join here in order to iterate over all of the environment names. In this case I've just selected the time, the value, and the environment as the metric. And as we saw previously, lateral joins allow you to define things before the lateral join and have those things accessible to the query after the lateral join. Before the lateral join, I've defined a query just to get all the environment names: I'm selecting the namespace as the environment from this hypertable, where the namespace has
the prefix demo, so in this case I have demo and demo2, and I'm calling this n. That's the query before the lateral join. After the lateral join I have the same query that I used before, where I'm selecting the one-minute intervals and the max value, and the only difference here is that in my WHERE clause I have the namespace equal to n, where n is this variable that we've selected as the namespace. Then for each value of n, it's going to generate a different line on the graph. As you can see, demo and demo2 have separate lines on the graph being generated, and we just close this up by joining on true. Okay, so that's how you can use lateral joins to get each of your different series, that is, each of the different values of your variable, showing up on the same graph.
Let's look at the pros and cons of this approach. The pros are that it's easy to implement once you know about lateral joins, and it also shows you the values of all the different series in your alert notification. As I mentioned, we have the different values, demo and demo2, and because they're all different series on the same panel, you can get their values when an alert is triggered on that panel. So in this case I can see demo has this value and demo2 has that value. But the con is that you can't actually see which of the series triggered the alert. It just tells you what the alert is and what the values of the series are, and you have to do the math, or you have to know the context in your head about what the threshold value is, and maybe go to Grafana to check that out. You could play around with the message that Grafana displays. Obviously, most of the time you'll be receiving alerts in some third-party platform other than Grafana, but you could play around with the notification message that's sent in order to include the threshold value alongside the alert value, to overcome that. So
the con, as I mentioned, is that you can see the values for each environment, or each value of the variable, but you can't immediately tell which one of the values caused the alert.
Okay, and then option three is that we use something called hidden queries for alerting. These are basically untemplated queries that you can then define alert rules on. So what I'm going to show you now is this third option. Wildcards we saw, and again, just to show you that we can actually have alerts on this: I can create an alert, and this functions like a regular query even though I'm using wildcards in this case to mimic templating. And here, under hidden queries, what I have is three different series. The first one is a templated query, where I'm using a variable name, and here you can see the syntax to indicate a variable. Then I have two other queries that are hidden. You can see the eye icon over here, which is to disable and enable the query. If I enable them, you can see them here, demo and demo2, but I've hidden these. What these are is, for each value of the variable, I've created a separate query. Now when I go to the alert section of the panel settings, I can create an alert, but I need to take care to create the alert on the query with the right letter. In this case I can choose B or C but not A, because my untemplated hidden queries are queries B and C, whereas my templated query is query A. So I can go under B and I don't get an error; I can select it, you know, and put in a value here. But if I select A, I get this error that says template variables are not supported in alert queries. So that's a way that you can overcome this by using hidden queries. You hide them so that when you're doing your exploration, the graph still works as you'd expect it to, but when you're doing your alerting, you
are able to have alerts on this graph. Again, the value of this is that it simplifies your operations pretty significantly, because you have exploration and alerting in the same place. But the con is that, again, your implementation time is quite high: you have to create one alert query for every value of the variable that you have. And you also only have one alert per panel. This is something that's pretty important to notice: you can only create one alert per panel. In this case, even though I have two queries that I want to alert on, I can only have one alert per panel. So I have query B, where I might say, let's say, the average of the value is above 80, and then I have to add another condition that says, or the average of the value in query C is above 80. And so again the issue is that it's going to be either query B that's causing the alert or query C, but I don't actually know which one. So even when I define the alert message and so on over here, I'm not going to know ahead of time; I have to come back here and see exactly which value of the variable is causing the alert. The restriction here is that you can only have one alert per panel, not per query, and therefore you can't see which value of the variable is triggering the alert.
Okay, so that brings us to the end of this part of the session, and to the end of the demo. Just as a quick reminder, we saw how to work around the problem of not being able to alert on templated queries in Grafana. We saw three different approaches: the first one was separating your exploration and your alerting; the second one was to use wildcards in your queries, along with things like lateral joins, to capture all the variable values; and the third one was to use hidden queries, so that you can set up alerts and have your
exploration in the same panel. To recap what we did today: we looked at three advanced practices for working with time series in Grafana. The first thing we saw is time shifting, which is a way to compare what's going on in your system to previous time intervals in the same graph. The second thing we saw was how to do efficient drill downs by automatically switching between different aggregations of your data. And lastly, I showed you some handy workarounds for setting up alerts on your templated dashboards.
So what's next for you? How can you take this knowledge and apply it to your own projects? The first thing I recommend is to check out our advanced Grafana tutorials. You can do so at this link, and it'll also be sent to you in the follow-up email, so that you can take the queries and the things that I've talked about and apply them to your own projects. If you'd like to see the setup of the DevOps scenario that we used, we actually use the Timescale Prometheus adapter and Helm charts, which give you a three-line installation of TimescaleDB, Grafana, and Prometheus in your Kubernetes cluster. You can find that at this link on our GitHub. The third thing is to join the Timescale developer Slack if you have any questions about TimescaleDB or how to use TimescaleDB with Grafana. There are people like myself, the Timescale co-founders, as well as other Timescale engineers who are Grafana contributors, so you can get help and learn from other members of the community. And then, if you're looking for the fastest and easiest way to get started with TimescaleDB, you can check out Timescale Cloud, and we're going to give you $300 in cloud credits to start off your evaluation. You can also use those cloud credits to finish the tutorial and to do other tutorials on TimescaleDB if you just want to play around. And then lastly, share your feedback and ideas with us for webinars. You know, we really put a lot of time and
effort into these, and we want to make sure they're as valuable as they can possibly be. So if you have topic suggestions or things that we can improve, check out this link, tsdb.co slash webinar-feedback. And if you want to brush up on your Grafana basics and check out the three previous sessions that we did on Grafana, you can go to tsdb.co slash timescale-webinars, and that'll take you to all the previous webinars that we did, including the Grafana ones. And then lastly, do join us for our next session, which is going to be happening in August, on five tips for improving your Postgres insert performance. If you're using Postgres, we talked today about how to use Grafana for visuals, but you need some way to get data into the database in the first place, so I'm going to be discussing some insert performance tips for you. The RSVP link will be in the follow-up email. Thank you so much for spending time with us today. I really do appreciate it, and we hope you've learned a lot. I'm going to make some time to take some questions, so if you have any questions about anything that we talked about, or other Grafana questions that you were hoping we'd cover but we didn't, please put them in the Q&A right now and we can answer them.
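The automatic switching between aggregates described in the second part of the session comes down to chaining queries with UNION ALL, where each query is guarded by a condition on the length of the selected time range, so that only one of them returns rows. Below is a minimal sketch of that pattern in Python, using SQLite from the standard library as a stand-in for Postgres and TimescaleDB. It is not the query shown in the session, and everything specific in it is an assumption made for illustration: the table and column names (rides, rides_daily, rides_hourly, pickup_ts, ride_count), the integer-second timestamps, the :t_from and :t_to parameters that play the role of the dashboard time range Grafana would supply, and the integer division that stands in for time_bucket.

```python
import sqlite3

DAY = 86400  # seconds per day; timestamps here are plain integer seconds

# Three branches chained with UNION ALL. Each branch carries a guard on the
# length of the requested range, so exactly one branch returns rows:
#   longer than 14 days -> daily aggregate
#   3 to 14 days        -> hourly aggregate
#   shorter than 3 days -> raw rows bucketed per minute on the fly
SWITCHING_QUERY = """
SELECT day AS time, ride_count, 'daily' AS metric
FROM rides_daily
WHERE (:t_to - :t_from) > :d14
  AND day BETWEEN :t_from AND :t_to
UNION ALL
SELECT hour AS time, ride_count, 'hourly' AS metric
FROM rides_hourly
WHERE (:t_to - :t_from) >= :d3 AND (:t_to - :t_from) <= :d14
  AND hour BETWEEN :t_from AND :t_to
UNION ALL
SELECT * FROM (
    -- Subquery keeps this GROUP BY separate from the final ORDER BY.
    SELECT (pickup_ts / 60) * 60 AS time, COUNT(*) AS ride_count,
           'minute' AS metric
    FROM rides
    WHERE (:t_to - :t_from) < :d3
      AND pickup_ts BETWEEN :t_from AND :t_to
    GROUP BY pickup_ts / 60
) AS minute
ORDER BY 1
"""


def build_demo_db(days=30, step_seconds=600):
    """Create an in-memory database with synthetic rides and two rollups.

    The rollup tables are ordinary tables standing in for the continuous
    aggregates mentioned in the session (hypothetical names).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rides (pickup_ts INTEGER)")
    conn.executemany(
        "INSERT INTO rides VALUES (?)",
        ((ts,) for ts in range(0, days * DAY, step_seconds)),
    )
    conn.execute(
        "CREATE TABLE rides_hourly AS "
        "SELECT (pickup_ts / 3600) * 3600 AS hour, COUNT(*) AS ride_count "
        "FROM rides GROUP BY pickup_ts / 3600"
    )
    conn.execute(
        "CREATE TABLE rides_daily AS "
        "SELECT (pickup_ts / 86400) * 86400 AS day, COUNT(*) AS ride_count "
        "FROM rides GROUP BY pickup_ts / 86400"
    )
    return conn


def fetch_series(conn, t_from, t_to):
    """Run the switching query for the range [t_from, t_to] in seconds."""
    params = {"t_from": t_from, "t_to": t_to, "d3": 3 * DAY, "d14": 14 * DAY}
    return conn.execute(SWITCHING_QUERY, params).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for label, t_to in [("30 days", 30 * DAY), ("7 days", 7 * DAY), ("1 hour", 3600)]:
        rows = fetch_series(conn, 0, t_to)
        print(label, sorted({metric for _, _, metric in rows}), len(rows))
```

Calling fetch_series with a wider or narrower range changes which branch contributes rows, which is the same effect as zooming in and out on the panel. In the setup the session describes, Grafana's time range and time filter would take the place of the explicit parameters, and TimescaleDB's time_bucket would replace the integer division.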
Info
Channel: TimescaleDB
Views: 5,696
Rating: 5 out of 5
Keywords: time series, data, database, timescale, timescaledb
Id: bnxSRyF2fnc
Length: 61min 18sec (3678 seconds)
Published: Wed Jul 15 2020