Monitoring Kubernetes with Prometheus – Tom Wilkie

Captions
[Applause] Hello everybody. Two companies, actually: last year it was Kausal, and a long time ago a company called Acunu, which was one of the Cassandra companies. But yes, now I'm VP Product at Grafana Labs. Who here uses Grafana? OK, not too many. Who here uses Prometheus? A few more. So, those of you using Prometheus who are not using Grafana, put your hands up — what do you use? Nothing? Interesting. Grafana is, I think, nowadays considered to be the de facto dashboarding solution for Prometheus. Anyway, today's talk is about Prometheus and Kubernetes: how they combine and how best to use them together. There's quite a lot of content out on the internet about this, so I'm going to deep dive into a couple of particularly useful queries and dashboards rather than give a really general, wide view — I've got links to some really good presentations at the end that do that.

For those who aren't familiar with it, Prometheus is a monitoring and alerting system. It is not a time series database — it has a time series database in it. It was built by some chaps at SoundCloud quite a long time ago now, but it's been open source for a very long time and it's a proper community effort, not owned by any single company. I'm one of the Prometheus developers; Björn, another of the developers, was actually in Lviv two weeks ago. A lot of the Prometheus developers are in Berlin, but there are about twenty of us spread all around the world, affiliated with all sorts of different companies. It's a really good project. Prometheus was inspired by Google's Borgmon. Borgmon is one of Google's internal monitoring technologies, used to monitor their services running on Borg, and Borg is Google's internal cluster scheduler, which begat Kubernetes — so you can see how Prometheus and Kubernetes were both inspired by complementary technologies inside Google.

So what makes Prometheus cool? Well, it has a simple text-based metrics format, which has since been spun out into a new open source project called OpenMetrics. This text-based format is really open and really easy to implement — you can emit it from a bash script if you really want to — and other vendors are starting to adopt it, which is awesome: InfluxData, Google and others are using it, so we're really trying to push it as the de facto instrumentation standard. Prometheus has a multi-dimensional data model: for people coming from, say, the Graphite world, where the identifiers for your time series are hierarchical strings, Prometheus encourages a label-based, multi-dimensional model instead, and I'll go into a lot more detail about that in a minute. And then one of the things I really like about Prometheus is its query language. We've not tried to copy any other query language, not tried to model it on SQL or anything like that; it's a very simple, very concise and very powerful query language, and I'll give you loads of examples of it. People tend to get a little bit confused about what Prometheus is for: it's not for logging, it's not for events, it's really just for metrics — really focused, with a really simple execution and operational model. It's a single binary you run on a server, and that's about it. It also works incredibly well in highly dynamic systems: if you've got a Kubernetes cluster with pods coming and going, moving around and being dynamically scheduled, Prometheus can keep up with that — it integrates with Kubernetes at the service discovery level, tracks pods as they get started and stopped, and starts scraping metrics from new ones as they're launched.
One last fun fact about Prometheus: it was the second project accepted into the Cloud Native Computing Foundation, after Kubernetes, which gives you some impression of the relationship between the two.

So what does Prometheus look like? It's a collection of components, with the Prometheus server in the middle; we'll start from the left and work our way right. Your jobs tend to be instrumented with the Prometheus client libraries and expose metrics about their behaviour — we'll get into what those metrics look like in a minute — and this is the first key distinction about Prometheus: it's a white-box monitoring system. The expectation is that you're going to come along and instrument your applications so that they expose Prometheus metrics, not just measure things you can read from the environment like CPU and memory. The reason we go white-box is that you get much more fine-grained information about your application's behaviour. That being said, there's a whole bunch of what we call exporters in the Prometheus world, which let you gather things like CPU, memory, disk, network — you name it, there's literally an exporter for everything. Exporters also act as adapters between other monitoring systems and Prometheus: say you've got MySQL, which exposes metrics in its own favourite format — the mysqld exporter will translate those into Prometheus metrics for you.

Then we move on to the Prometheus server in the middle. The server uses service discovery, as I mentioned — in this example, Kubernetes — to find all the applications in your infrastructure, then connects to them and scrapes their metrics. This is the second big differentiator of Prometheus: it's pull-based; it pulls metrics from your applications. If you're familiar with things like Graphite and statsd, those are push-based systems where your applications push metrics to the server. The reasons we prefer pull are multifaceted. The first is that it's very hard to overwhelm a Prometheus server: if you spin up too many applications and start to overwhelm it, it can control the period at which it gathers metrics and back off, which gives us a fail-safe. The second reason is that, because we've got service discovery, we can gather lots of really useful metadata about the targets we're scraping and add it to the time series: the name of the job, obviously, the host the job is running on, its IP address, its version — you name it. Any metadata about the target that's available to our service discovery system can be pushed into the time series database, giving you a lot more context when you're building queries.
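To make that metadata concrete (this isn't from the slides — the label values here are made up), every scrape target also gets a synthetic "up" series carrying the labels that service discovery attached, which turns "is anything down?" into a one-line query:

    # 1 if the last scrape of this target succeeded, 0 if it failed;
    # the job and instance labels were attached by service discovery
    up{job="nginx", instance="10.2.3.4:80"}

    # how many targets of each job are down right now?
    count by (job) (up == 0)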
The third reason we like pull-based metrics gathering is that you can run multiple Prometheus servers. Because Prometheus initiates the connection, there's no problem with having two of them; if you're a Prometheus developer like I am, you can run a third one with your own custom changes in it and test those changes, and if you're a monitoring vendor you can come along and run another one and have that scrape as well. Because it's pull-based, you don't have to go and change your application every time you want to run a new monitoring system, and you don't have to run proxies and things like that. So the pull-based model is a big part of the Prometheus culture. It's also a bit of a religious thing — a bit like Marmite: a lot of people get quite upset about pull-based models, and, well, they'll just have to get over it.

Inside Prometheus you've also got the time series database, which stores the data it scrapes — more on that in a minute — and a query engine, which is where you send Prometheus queries; it executes them and returns the results. The query engine is where the Prometheus web UI, Grafana and other API clients integrate with Prometheus and render those really nice dashboards you've seen. Prometheus will also evaluate periodic rules — recording rules and alerts — and when alert rules fire, they send notifications to the Alertmanager. A common misconception about the Prometheus system is that the Alertmanager evaluates your alerting rules; it doesn't. Your alerting rules are evaluated by Prometheus. The Alertmanager sits there, and its job is to take the notifications those evaluations create and group them together, dedupe them, silence them, and route them to the correct end user. So generally in an organisation you might see lots of Prometheus servers, maybe even one per team, but typically only one Alertmanager cluster, which is organisation-wide, deals with the routing and can do really cool things. The Alertmanager we have set up at Grafana Labs uses the same alert rules but routes to different people depending on whether the alert is from dev or production, which I think is really cool: it means when I'm on call for production I don't get woken up if someone breaks dev. I'm actually on call right now, which is why I keep looking at my phone. The Alertmanager then integrates with things like PagerDuty, VictorOps, email and Slack to get those notifications to you.

I've talked about the metrics format — here's a big page of text; this is what it looks like. I challenge you to find another monitoring system whose metrics format fits on a slide. It's relatively simple: one metric — or rather one time series — per line. At the beginning is the name of the metric, then you've got key-value pairs which identify the labels, and then you've got the value. That's it. You can see how you could quite easily output this from a bash script.
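As a reconstructed example (not copied from the slide; the numbers reuse the ones quoted later in the talk), the text exposition format looks roughly like this — one time series per line, with optional HELP and TYPE comments:

    # HELP http_requests_total Total number of HTTP requests served.
    # TYPE http_requests_total counter
    http_requests_total{path="/home",status="500"} 34
    http_requests_total{path="/settings",status="502"} 96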
Moving on to the Prometheus data model: it's again incredibly simple — we've optimised for being easy to understand, with a low surface area. It really just consists of one of those identifiers, which is a big bag of keys and values, and then a set of time-indexed values. The time index is milliseconds in an int64 and the values are just float64 — that's all we support, the simplest model we could think of. Some people get hung up on the fact that we don't do higher precision and don't do other data types, but the reality is that this is really flexible and super efficient to store.

So let me give you an example of some time series — this is what some of the time series identifiers might look like. (I reformatted the slides for the black background, so sorry about the wrapping.) You can see each time series has a set of labels associated with it, and those labels are different for each series. Some of those labels come from service discovery and some come from the target itself: the metric name http_requests_total, the path and the status are white-box information we got out of the job, whereas the job name "nginx" and the IP address of the job in the instance label were added by Prometheus to identify the target. (It should say "instance", not "instances" — every time I show these slides I notice another little typo; maybe next time they'll be typo-free.)

Once you've got all these time series — a pretty standard Prometheus server will easily have millions of them — how do you write your queries? How do you select the time series to include? You use what we call a selector. Selectors look very similar to the OpenMetrics format, which is quite useful because you can cut and paste between the two, and they basically allow you to select time series using matchers on those keys and values. Here I'm saying: please return all the time series for http_requests_total, for the job "nginx", that have a status matching the regular expression "5..". This gives me all the 500-class HTTP errors. What you'll notice in Prometheus is that everything is a counter — we try to encourage everyone to model everything as a pure counter, because that way, if we drop a scrape because we're rate-limiting ourselves, we're not missing information, we're just downsampling. So what this selector returns is the total number of requests served on each of these paths with a 5xx status code since the job started.

You might think that's not particularly useful, and right now it isn't, so what I'm going to walk you through is how to turn this data into more useful information. Here we can see we've had 34 requests to /home fail with a 500 and 96 requests to /settings fail with a 502. I want some context: did they fail when the job started, or are they failing now? All I've got is a count that's always increasing, so how do I tell what the rate is? Effectively I need to differentiate that number, and we have the ability to do that. First you take a window over the data: putting [1m] in square brackets gives us a one-minute moving window over the time series. This has run off the slide a bit, but you can see we now get a set of values, and the values you can see are all monotonically increasing — 30, 31, 32 — except for the series at the bottom: 56, 106, 5. Because Prometheus knows these numbers only ever increase, it can detect that the job restarted — the value didn't increase — and compensate for it: going from 106 to 5, the counter has got to have increased by at least 5. We're still dropping a little information, but we can calculate the rate much more accurately. Then we apply rate() to this, and now we can see the instantaneous error rates for the two paths: very few errors on /home, but quite a lot of errors on /settings.
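Written out as PromQL, the steps described above look roughly like this (metric, job and label names follow the example on the slide):

    # 1. select the failing request counters for the nginx job (all 5xx statuses)
    http_requests_total{job="nginx", status=~"5.."}

    # 2. add a one-minute moving window, turning each series into a range vector
    http_requests_total{job="nginx", status=~"5.."}[1m]

    # 3. differentiate: per-second error rate, with counter resets handled for you
    rate(http_requests_total{job="nginx", status=~"5.."}[1m])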
Even this, in and of itself, is still not particularly useful. It's kind of interesting, but is two and a half errors a second a problem? That really depends on how many requests the settings page is getting: if it's doing a thousand QPS, then two and a half errors is probably not a big deal; if it's only doing two and a half QPS, it probably is. So we can aggregate these, subdividing by path — because when you're running a nice microservice architecture you'll have multiple replicas of each service, and I don't really care about the individual replicas, I care about the service as a whole. We aggregate by path, and then we do a binary operation: divide this by the total number of requests, also aggregated by path. Prometheus will match the labels on either side of a binary operation, and then we can see that it's actually a hundred percent of requests to /settings that are failing, and 0.001 — so 0.1 percent — of requests to /home. Hopefully this has shown you why I like PromQL: with a relatively simple language, with only two or three fundamental operations, we've turned what is basically raw data — the number of requests this job has served — into a useful, actionable piece of information: what percentage of requests are failing, by path. We actually use this exact expression in our alerts and in our dashboards.
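Put together, the expression built up here is, as far as I can tell, a ratio of two aggregations (again using the example metric and job names from the slides):

    # fraction of requests that are failing, per path
      sum by (path) (rate(http_requests_total{job="nginx", status=~"5.."}[1m]))
    /
      sum by (path) (rate(http_requests_total{job="nginx"}[1m]))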
OK, that was my really quick introduction to Prometheus — hopefully just enough to follow what we're going to do next. So let's talk about Kubernetes. Who here is using Kubernetes? There are people who put their hand up for Kubernetes who didn't put their hand up for Prometheus — so hands up if you're using Kubernetes and not using Prometheus. The guy in the back in the blue top — what are you using to monitor your Kubernetes? OK, but you're going to get to Prometheus — cool. One of the Kubernetes founders said that Prometheus is now basically the de facto monitoring system for Kubernetes and that he would expect every Kubernetes cluster to have Prometheus deployed in it, which is kind of nice if you're a Prometheus developer.

Kubernetes, as most of you know, is a platform for managing and orchestrating containers — some people who like to hyperbolise talk about it as an operating system for your data centre. It's inspired by Google's Borg, as I said, and it was the first project in the CNCF — the CNCF was founded to contain Kubernetes. One of the things I want to focus on here is its rich object model for your applications. One way to think of Kubernetes is not as a collection of components and jobs that you have to run, configure and connect up — which is interesting, but not the point of this talk — but rather as an abstract model which you design your application to fit. When building an application on top of Kubernetes you need to decide how you're going to model it: is it going to be a StatefulSet, a DaemonSet, a Deployment and ReplicaSet? Are you going to use the CronJob features? Inside a pod, how many containers are you going to have, and what will the different containers be? How are you going to compose the different pieces of your application together, and how are you going to deal with storage? You design your application to fit inside this object model, and because some of the semantics of your application are encoded into it, it gives us a really good starting point for building alerts.

That might be a bit abstract, so let's go into a bit more detail. A DaemonSet, for those who don't know, is the Kubernetes primitive for saying "I want one copy of this pod running on every single host in my cluster". This is super useful if you want to run some monitoring or log aggregation on every host in your cluster — I mentioned earlier there's a Prometheus exporter for gathering CPU and memory information, called the node exporter, and I want that running on every single node in my Kubernetes cluster, so I model it as a DaemonSet. On the other hand, take the front-end service from the HTTP examples earlier: I might have a hundred or a thousand nodes in my cluster, and I don't need my front-end service running on every single one of them — maybe thirty or forty replicas are enough to deal with my inbound traffic. So I'm not going to use a DaemonSet for that; I'm going to use a Deployment and a ReplicaSet. I say to my Deployment "I want thirty copies of this version of this container", and it will decide where to put them, restart them when they fail, and do a rolling upgrade when I change the image. This is what I mean by modelling your application using the Kubernetes primitives.

Since we've got this object model, we should really get metrics about it, and this is where the first job I'd like to mention comes in: kube-state-metrics, written by some friends of mine. It's a pretty cool job — you run it as a Deployment, and you really only need one of them — and it exposes some really interesting metrics. For instance, it exposes things like the number of pods that are running and what state each pod is in, the number of available pods for a Deployment, the number of times a pod has restarted, and all sorts of other really interesting information. This is really useful for encoding some of those assumptions you've made about the object model into alerts, so you can get told when your Deployment isn't deploying or when your upgrade to a new image has failed — I'll show you how to do that in a minute.
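For a flavour of what kube-state-metrics exposes, here are a few of its series (the names are the upstream ones as far as I recall — check your version — and the label values are invented):

    kube_deployment_spec_replicas{namespace="default", deployment="frontend"}              # replicas you asked for
    kube_deployment_status_replicas_available{namespace="default", deployment="frontend"}  # replicas actually available
    kube_pod_container_status_restarts_total{namespace="default", pod="frontend-abc12"}    # restart counter per container
    kube_pod_status_phase{namespace="default", pod="frontend-abc12", phase="Running"}      # one series per pod per phase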
The next thing I want to talk about is something called cAdvisor, by Google. This is a job for exporting per-container resource usage information: if you want to see the amount of CPU time a container is using, the amount of memory it's using, disk and so on, cAdvisor is where you go. This is really useful because in Kubernetes you're allowed to specify limits on the memory and CPU a pod can use, and — true story, this happened to me last week — you might get a page about slow queries coming into your service, go and look, and nothing seems to be broken; it all just seems to be running slowly. Well, you can go to cAdvisor and, using PromQL, compare the CPU usage of your pod to its limits, see that there's a pod bouncing off its limit, and then either add more pods to that service, increase its limits, or do something else. This is really actionable information when you're looking at performance data.
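A sketch of that usage-versus-limit comparison (metric and label names vary a little between cAdvisor and kube-state-metrics versions — for example pod versus pod_name — so treat this as the shape of the query rather than something to paste in verbatim):

    # fraction of its CPU limit each container is actually using;
    # values close to 1 mean the pod is bouncing off its limit
      sum by (namespace, pod) (rate(container_cpu_usage_seconds_total[5m]))
    /
      sum by (namespace, pod) (kube_pod_container_resource_limits_cpu_cores)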
So that's my overview of the different jobs you can use and the different integrations between the two systems. I'm not going to do a really boring walkthrough of how to wire these all up and how to install them — there are links at the end of the talk to great how-tos and great automation online. What I'm going to show is: with all this information, what are the really interesting queries you can run and the interesting alerts you can build? Before we dive into those queries, I want to give you a model to think with. I think it might have been last year, somewhere else, that I gave a talk about what I call the RED method, which is a model for thinking about alerting. I'm going to use three methods here; the third one doesn't have a name yet.

The first is the USE method. This is where we say: for every resource in the cluster, I want a utilisation, a saturation and an error rate for that resource. Utilisation is generally considered to be the amount of time the resource was busy — quite obvious for a CPU, and even for a hard disk it's relatively easy to measure. Saturation is the amount of work that resource has had to do, which of course can be more than the amount of time it had to do it in: for a CPU this would be the load average, for a disk it would be your I/O queue length. Errors are pretty self-explanatory. The RED method says: instead of looking at resources, look at services — for every service in your infrastructure, measure the request rate, the error rate, and the duration of those requests. This should be pretty straightforward if you're running a microservice architecture; you can export this information in a white-box manner and alert on it, and it relates quite nicely to user experience: if there's high latency you'll measure it in your request durations, and if there's a high error rate your users will be seeing page load failures. The final method, which I alluded to earlier and don't really have a name for, is where you encode the expected state of the system as invariants that you use in your alerts — this will tell you, for instance, if your rollouts are failing.

So let's look at the USE method. You can build really nice dashboards that tell you how much CPU is in use and how much memory is in use, which is useful for getting an overview of how much headroom you've got left; you'd use the node exporter DaemonSet I mentioned. All of the dashboards in these screenshots are open source — reusable, extendable and configurable — in something called the kubernetes-mixin, which is a project I've been doing with Frederic at Red Hat. It's becoming kind of the de facto set of dashboards and alerts people want to use for their Kubernetes clusters; it's getting quite widely used, and I think Google is contributing to it now.

To give you an example of how you'd measure CPU utilisation, you'd use an expression like this. We select all the time series for node_cpu with mode="idle". This is a counter that only ever increases and measures seconds, so if we rate() it — take its derivative — we get idle seconds per second, and one second per second is 100 percent free; one minus that number is the utilisation. We average it because machines have multiple CPUs nowadays. Saturation is a little bit different: we use the node's load average, and the load average is not a counter, it's a gauge — just a value that goes up and down — so we're not using rate() there. (I don't know why "USE method" appears at the end of the slide — that's a problem with reformatting the slides.)
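Roughly, the two expressions look like this (the CPU metric is node_cpu in older node exporters and node_cpu_seconds_total in newer ones):

    # utilisation: one minus the average idle-seconds-per-second across a node's CPUs
    1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[1m]))

    # saturation: the one-minute load average is a gauge, so no rate() needed
    # (the kubernetes-mixin goes on to normalise it by the number of CPUs per node)
    node_load1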
But the USE method isn't just for nodes and hosts — you can also use it for containers, via cAdvisor. cAdvisor actually keeps moving around: it was bundled into the kubelet, and I think they're now taking it out again, but it's in most Kubernetes clusters if not all of them, and it gives you CPU, memory and so on per pod. You can start combining this with the metadata from kube-state-metrics and do much more interesting analysis. One of the common things you see in Kubernetes is that every pod within a service has some label that unifies them all — you might add an "app" label that says this is the front end; on most of the clusters I've worked on at Grafana Labs we use a "name" label instead of the "app" label. But when you look at the pod name, it tends to be different for each pod — it tends to contain some random information and a hash of the config — so grouping pods by their name is not that useful, whereas grouping them by one of those labels is. Unfortunately cAdvisor doesn't know the labels, but kube-state-metrics does, so what we can do is use a query to join the label and the CPU usage — and this is the "wow, mind blown" bit. I'm not just going to leave it there; I'm going to give you a bit of a demo of how that works.

First off, this is CPU usage by pod name and namespace. This is one of our internal clusters, so it's not public information, but it's fine — you can see we've got a lot of pods running on this cluster and you can see their CPU usage. Now, it's not really useful to aggregate by pod name, because, as I said, look: these two pods are part of the same service but have different pod names. What we really want to do is aggregate them together by, effectively, service name. So let's pick out this piece: it's getting all of the kube_pod_labels series, and each one is a time series whose value is always 1. You might think that's a bit of a weird way of doing it, but it allows us to multiply these two queries together — the 1 just cancels out — and we get a time series that contains both the label name and the pod name. You can see there's a huge amount of metadata on these. So — let's see if I can do this without looking at the cheat sheet; it's always good to test yourself — we multiply with "on(pod_name, namespace)", which tells Prometheus that the pod name and namespace labels are common to both sides and should effectively be joined on — like the join parameters in your SQL clause — and then we say "group_left(label_name)", which says: take the label_name label from kube_pod_labels and add it to the time series for container CPU usage. And it worked — anyone would think I've done this before. There's still a lot of information, though, so we slam a final aggregation on: sum by (label_name) — hang on, too many brackets — there we go. We've now taken that result and aggregated it by the name label, and we can see the amount of CPU each service is using.

At this point it might be useful to show a graph. This is the experimental Explore view we've just launched in Grafana; it allows you to do this kind of ad hoc analysis against your Prometheus without having to build a dashboard. It's pretty cool — it's in the latest Grafana release, which I think is 5.4, although I don't actually get that involved with the Grafana side, and there are some really good videos online. Now we can see that the biggest offender by CPU usage is the ingester, which is useful to know. Maybe this looks a bit complicated if you've not seen PromQL before, but it's a really simple query — if you had to rewrite it in SQL it would be huge. Anyway, that's how you get CPU by service with Prometheus.
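The join from the demo looks roughly like this — with the caveat that, depending on your cAdvisor and kube-state-metrics versions, the pod label may be called pod rather than pod_name, in which case a label_replace() is needed to make the two sides line up:

    sum by (label_name) (
        # per-pod CPU usage from cAdvisor...
        sum by (namespace, pod_name) (rate(container_cpu_usage_seconds_total[5m]))
      # ...joined on namespace and pod name, copying label_name across from...
      * on (namespace, pod_name) group_left(label_name)
        # ...the kube_pod_labels series, whose value is always 1
        kube_pod_labels
    )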
Next, the RED method — not the final thing, the second to last. These metrics are quite interesting because every single component in the Kubernetes system exposes Prometheus metrics: they all expose things like the requests they're processing, how many of them have errored, and so on. Here we're looking at the requests going into the Kubernetes API server, grouped by resource and error code, just like I showed you earlier. That's useful, but one thing to note about white-box monitoring is that it tends to sit above the transport layer — you tend to use HTTP middleware to monitor each HTTP handler — and that will miss transport-layer errors. The best way to monitor transport-layer errors is to monitor the clients, and because most of the Kubernetes components are written in Go and use the Kubernetes Go client libraries, they all expose the same metrics. This is really cool: you can gather metrics from everything talking to the API server, ask whether those requests are succeeding or not, and get transport-level errors. This is probably the single most useful alert I've had when building Kubernetes clusters, because it tells me when my certificates are broken — and certificates are the bane of building any Kubernetes cluster. The expression is basically the same as the query I walked you through: we're looking at rest_client_requests_total for anything that's not a 200, and dividing it by all of them, so we've got a relative measure — I don't want to look at anything absolute — and multiplying by a hundred because I'm including this in an alert rule and don't want to do that conversion there. This is also part of the kubernetes-mixin.

And finally, the method that doesn't have a name yet — suggestions on a postcard. This is where I build invariants that tell me whether my system is behaving as designed. If I want to know whether my Deployments are working, I can compare the number of replicas I've asked for with the number of replicas they actually have, which is super simple — that's the first query. It will tell me if, when I deploy a new image, the rolling update fails: the alert fires saying there isn't the right number of replicas. It's a really high-level alert — people always say you should alert on user experience, at the highest level possible, because it captures a multitude of sins — and this one will catch so many different problems with deployments. You can do a similar thing for StatefulSets. The next one tells me if my pods are restarting too quickly: one thing that's quite common in a system like Kubernetes, which will effectively babysit your pods for you and restart them if they fail, is that you might not notice your pod is crash-looping a little bit — maybe it's not crash-looping all the time, maybe it just fails every ten minutes — so this alert tells me if a pod is restarting too often. There's more: another alert tells me if a pod has failed to start at all — maybe you've given it the wrong arguments. And the final one down here takes a relatively complicated concept: Kubernetes is supposed to run on multiple nodes, and the reason you run on multiple nodes is that you want to tolerate node failure. But if you've come along to your Kubernetes cluster, given all of your jobs resource allocations, and allocated 100 percent of the resources in your cluster, then when you have a node failure there's no place for the jobs that were running on that node to be restarted. So I want an alert that tells me if the free resources in my cluster aren't big enough to tolerate a node failure — basically, if the ratio of used resources to total resources is greater than the ratio of the number of hosts minus one to the number of hosts, send me an alert, because I've over-provisioned my cluster. This is kind of powerful, and again all of these are part of the kubernetes-mixin.
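Here are sketches of the alert expressions from this section. The metric names are the conventional client-go and kube-state-metrics ones as far as I know, the thresholds are illustrative, and the real kubernetes-mixin versions are more carefully tuned:

    # transport-level client errors against the API server (catches broken certificates);
    # alert when this percentage gets higher than you can tolerate
      sum(rate(rest_client_requests_total{code!~"2.."}[5m]))
    /
      sum(rate(rest_client_requests_total[5m])) * 100

    # a Deployment whose rollout is stuck: requested versus actually-available replicas
    kube_deployment_spec_replicas != kube_deployment_status_replicas_available

    # pods restarting quietly in the background
    rate(kube_pod_container_status_restarts_total[15m]) > 0

    # cluster too full to tolerate the loss of a single node
      sum(kube_pod_container_resource_requests_cpu_cores)
    /
      sum(kube_node_status_allocatable_cpu_cores)
    >
      (count(kube_node_info) - 1) / count(kube_node_info)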
The final shout-out I wanted to give is to a project called Cortex, which I'm one of the lead developers on. It's a horizontally scalable, highly available version of Prometheus, and it's recently been accepted into the CNCF sandbox — so I'm now a maintainer on two CNCF projects. It's pretty cool and I'd love you to check it out. It's really early: it works — we've been running it in production for a while now — but it's not very friendly yet; it's generally designed to power Prometheus-as-a-service. So I'd love feedback on it.

If you want to get started with any of this and think any of it looks interesting, here's a bunch of links to go and look up. The easiest way to get started with Prometheus is the Prometheus Operator, again by the chaps at Red Hat; it's a really easy way to install a fully working Prometheus system, and it includes the kubernetes-mixin that I've mentioned. If you want to do it the hard way — life on hard mode — you can try my ksonnet library; this is how I provision Prometheus, Grafana, the Alertmanager, the node exporter and all these different components. It's an extensible, configurable language for describing your Kubernetes objects, it's super cool, I encourage you to check it out, and it's very similar to how Google internally manages their Borg jobs. And then finally, the kubernetes-mixin: a collection of alerts, dashboards and recording rules designed to work on pretty much any Kubernetes cluster, designed to be really flexible, which gives you all of the goodness of the many years of effort that have gone into designing good recording rules, alerts and dashboards for Prometheus — for Kubernetes, rather.

If any of this has been interesting to you, I encourage you to check out the Google SRE book — there's a second and maybe even a third one now; it's really good. The RED method I talked about, for instance, is really just the four golden signals that Google talks about, minus one of them. If you want to go into the nitty-gritty of all the different jobs you'd use, what metrics they expose and how to deploy them, I encourage you to watch Bob Cotton's conference talk — he gives a really good introduction to all the different moving parts and goes into a lot more detail than I was able to. And finally, no monitoring talk would be finished without a call-out to Brendan Gregg's website: the USE method is awesome, his write-up of how to use it is absolutely fantastic, and if you've got any questions about performance monitoring, that's where you should go. Cool — thank you very much. Any questions? [Music]
Info
Channel: GDG Lviv
Views: 26,550
Rating: 4.9322033 out of 5
Keywords: conference lviv, devfest, devfest ukraine 2018, gdg, gdg cloud, tom wilkie, monitoring kubernetes with prometheus, monitoring kubernetes, kubernetes and prometheus
Id: kG9p417sC3I
Length: 36min 54sec (2214 seconds)
Published: Wed Nov 21 2018