Simple, scalable deployment for Grafana Loki and Grafana Enterprise Logs

Captions

Hi, my name is Trevor Whitney. I'm an engineer here at Grafana Labs working on the Loki and GEL projects, and I'm excited to introduce you today to the simple scalable deployment architecture. This is a new architecture that we released in GEL 1.2 and Loki 2.4 that we hope will help you get up and running with a scalable architecture of Loki in less time. So let's go ahead and get started.

I'm actually going to go to the GEL download page, and the reason I'm going to grab the GEL packages is because we ship a Debian and RPM package that includes systemd units, so it's going to help us get up and running faster. The new targets that exist in the simple scalable deployment also exist in Loki 2.4.1, so you can do this using the open source Loki, and if you run GEL, the enterprise Loki, without a license, it should operate just like an open source Loki. So you should be able to follow along with this exercise exactly if you want to.

To do this exercise I am going to have two machines. No Kubernetes, no orchestration; we're just going to do this as if we were on bare metal, to hopefully illustrate how simple this deployment architecture is even if you're not using Kubernetes. Now, behind the scenes, what's going on here is I'm actually running a Docker Compose stack. It's giving me a Grafana, it's giving me a MinIO, which we're going to use for object storage a little bit later, and it's also giving me a mingrammer/flog image, which is hitting my Loki using the Docker Compose Loki logs driver. And so what I'm actually operating in here is two containers. On the left side here is going to be the write container (well, that's confusing, isn't it?) and on the right side is going to be the read container.

I've gone ahead and downloaded that RPM package, and so I'm going to run this script. What this script does is it's going to do rpm -i to install, it's going to change [unclear] /var/lib/enterprise-logs, that is the
directory that the RPM-installed Loki and systemd unit are going to use for their work directory, and then it's going to call systemctl enable enterprise-logs. So let's go ahead and do that.

Okay, so this installed a few things of note. There's the actual binary, in this case enterprise-logs. There is a configuration file, which is at /etc/enterprise-logs/config. This is the config that ships by default; this is fine for running a single binary instance of Loki. And then we also have the runtime environment file. This would be /etc/sysconfig/enterprise-logs if you are on a Fedora or CentOS system; if you are on a Debian system this will be under /etc/default. And so you can see here those are some environment variables we set, including the config file and runtime config. Custom args is where you can specify runtime flags, and that's where we will specify targets later on.

But for starters, let's just get going with a single binary. I just installed it and I've enabled the systemd unit, so I'm just going to go ahead and start it, and then I will tail the logs. Great, and so we're up and running. It's spitting out some errors right now about an empty ring. As I mentioned, I have a mingrammer/flog instance using the Docker Compose driver that's already been spitting logs at this address, and previously there was nothing there to handle them. Now we brought up a Loki, but the ring hadn't stabilized. Now the ring has stabilized and those errors have stopped.

So let's go ahead on over to the browser, and we're going to go into the Grafana and log in. I have some browser history here; that's a preview of the future. Right now it's on the Loki simple scalable data source. The only difference between these two data sources is that the single binary one is configured to read from the write node, which is the one that is acting
as our single binary right now, and we'll get to the simple scalable later. But here we go: put it on the Loki single binary, and we see that the logs are coming in. These are just garbage logs from the mingrammer/flog. Great, so that's single binary. Let's go back to the terminal and let's do better.

Okay, so single binary is a great way to just spin up a single instance, kick the tires, maybe throw some localhost logs at it, practice LogQL, things like that. But it's difficult to scale because you've coupled the read and the write path. Not only do those paths often have to scale separately, but you now have created a situation where, if you have a really heavy query load or a bad actor on the query side, it could actually cause you data loss and take down your write path, and similarly a problem with the write path could take down the read path. So what the simple scalable deployment model does is it splits write and read. It doesn't go to full microservices, because we're trying to keep things simple. The goal is to just have two targets that can be scaled independently and that, depending on the size of your environment, should be plenty robust enough to handle many production loads.

So how do we get there? Well, the first thing that we're going to do is go into our config, so that's enterprise-logs, and there's our config. I want to set ourselves up for success for when we scale, and that is, I want to provide a memberlist config. When we start scaling our deployment (7946), the various write nodes and the various read nodes need to talk to each other, because the distributors need to find out where all the ingesters are and the query frontends need to find out where all the query schedulers are. We use a component called a ring to do that, and the ring needs a key-value store backend. The simplest one is memberlist. Memberlist uses the gossip protocol to
communicate this information between the various nodes, so we can just add memberlist here, and that way we don't need to add another dependency. So what I've done is I've gone ahead and added the memberlist config, and now I'm going to come down here and change the common ring config. The common config section was added in GEL 1.2 and Loki 2.4. What it does is it allows you to just define your ring configuration and your storage configuration in one place, and it'll replicate that config to all the places that it's needed. And then the last thing I'm going to do: I'm not sure if you caught this in the logs, but the current load that I'm getting from my dummy log system is higher than the default, the default being three megabytes. So I'm going to up this to five megabytes just to quiet that log stream down a little bit.

Okay, so now we've gone and added memberlist to our write node, so it can now cluster. The next thing that we need to do is change the runtime environment arguments to specify the target, so that's in /etc/sysconfig/enterprise-logs. We're going to come down here. When you don't provide any target, the default target is all, so single binary, which is what we were running previously. And so what I'm going to do is just turn this into a write node. Okay, I'm going to go ahead and restart it now, so systemctl restart the enterprise-logs service, and then we'll go ahead and tail those logs again. And here's our empty ring; that should resolve itself.

So I'm going to flip over to the read node now. We installed and ran single binary and saw that work, we turned our single binary node into our write node, and now this other container that's just been chilling out over here, we're going to turn that into our read node. So I'm going to run that same install script. Cool. Now I need to do a little bit of a hack here, and that is because, as I mentioned, I'm doing this demo inside of containers
and in order to get systemd working inside of containers I had to mount my local cgroup, and there's interesting things going on with the systemd bus, in that if I don't rename this service it will conflict with the one running in the other container. So again, if you're running this on VMs you don't have to do this step. I'm in containers, so I do. I'm going to systemctl disable the enterprise-logs service. I'm going to move enterprise-logs... sorry, I need to be in systemd system. I'm going to move the enterprise-logs service; I'm just going to call this enterprise-logs-read. That's just a little hack that I need to do because I'm inside of containers. Again, you won't need to do this if you're running on VMs or bare metal. So let's go ahead and enable the read service.

Okay, so I need to modify the default config, so again that's in /etc/enterprise-logs/config, and I'm going to add the memberlist config. Oh, and this loki:7946, what's going on there? I will show you in just a second. I have a DNS entry set up so that both of these containers are behind that one DNS entry, loki, and that just makes it really easy to set up memberlist. It's just one thing that you have to add to that join members, and then as instances join and register themselves with your DNS A record they'll get added to the ring, and the same when they leave. So that's a quick little trick to make memberlist a little bit easier.

So we went ahead and changed the config. Now I'm going to go to the sysconfig, this is again where the environment variables are, and make this a target of read. So again, write is on the left, just to not be confusing; read is on the right. And now I'm going to go ahead and start my enterprise-logs-read service and we will go ahead and tail that. Okay, so we have our two services working, so we should be able to go back to the browser now, and I'm going to change my target here, my data
source. Great, and it ran that same query against log gen, and we now have more recent logs, and these are now coming off of my read node. So the read and write are working together. This is great: we went from single binary, and we now have separated our read and our write path.

Now the only thing left to do is to scale those. Well, it's going to be hard to scale right now because we are using filesystem. If we go ahead and look at our config, our storage is filesystem, and so the next step is we're going to add an object storage backend here. However, migrating from filesystem-backed storage to object storage could get a little messy: your indexes are going to get out of whack, your BoltDB active directory could get a little confused. So what I'm going to do is just nuke these two containers, reboot, and set the cluster up again from scratch, but using object storage as the storage backend. I'm just going to cut real quick, and I'll bring you back after I've gone through that reboot.

All right, I'm back. So here I am again: write is on the left and read is on the right. I'm going to go ahead and add object storage as our backend, so I need to go to the /etc/enterprise-logs config. Here we have our memberlist config, we have a ring configured with memberlist, and what I'm going to do is take out this filesystem storage config and replace that with an S3 config. It's using MinIO. Here we go. This is just another service that's running in my Docker Compose stack. Obviously I don't recommend using MinIO for production, but this will be fine for this experiment. So again, this is our write config over here on the left, so just double checking that that looks good. Again, we only have to define the storage config in the one place, in the common storage. Okay, great. And then, just to remind ourselves, here's the sysconfig
enterprise-logs with our target equals write. Okay, so let's go ahead and spin that up and tail it.

Now I'm going to come over here and do the same thing for my read node. So again, let's make sure that this memberlist is correct, and then we have the ring store memberlist right there. Perfect. I'm going to go ahead and add my object storage config there. Oops, that was the right indentation... these need to go in one. I'm going to get rid of this, so we're no longer using filesystem. Okay, that looks right. Let's remind ourselves of this: target equals read. Great, so let's start this one up and tail it.

Okay, so we now have a write and a read, both backed by object storage. Let's go ahead on back to our Grafana instance and see how we do. We restarted everything, so I have to log back in again. Okay, so unfortunately it looks like there's still a little bit of cruft left from our running in filesystem, so I'm going to go see if I can fix that, and I'll be right back.

And I'm back. So we can see here now that there was still just a little bit of cruft in the /var/lib/enterprise-logs folder. Once I cleared that out, we can go ahead and run this, and we can see down here, if we go back to our terminal, here's the read on the right and the write is on the left, and these are now using MinIO as their backend for object storage, so they can be scaled independently. We now have a production-ready architecture for running Loki at scale. So thank you so much for taking the time to watch this video. I do hope you give the simple scalable deployment model a try, and please let us know any feedback that you have. Thank you so much.
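
The config files themselves are never shown in the captions, so the following is only a rough Python sketch of the pieces the walkthrough describes: one memberlist join address shared by both nodes, a ring backed by memberlist, storage defined once under the common section, and a per-node target flag. The key names (`memberlist.join_members`, `common.ring.kvstore.store`, `common.storage.s3` and its fields) reflect the Loki configuration format as best understood here and should be checked against the reference for your version. The MinIO endpoint, bucket name, and credentials are hypothetical placeholders, not values from the video.

```python
import json

# Targets mentioned in the walkthrough: "all" is the default (single binary).
VALID_TARGETS = {"all", "read", "write"}


def shared_config(join_address, s3):
    """Return the config sections that the read and write nodes have in common."""
    return {
        # One DNS name that resolves to every node, plus the gossip port.
        "memberlist": {"join_members": [join_address]},
        "common": {
            # Use memberlist as the key-value store behind the ring.
            "ring": {"kvstore": {"store": "memberlist"}},
            # Storage is defined once here rather than per component.
            "storage": {"s3": s3},
        },
    }


def target_flag(role):
    """Return the runtime flag that selects which components a node runs."""
    if role not in VALID_TARGETS:
        raise ValueError(f"unknown target: {role!r}")
    return f"-target={role}"


if __name__ == "__main__":
    # Hypothetical MinIO settings; none of these values come from the video.
    minio = {
        "endpoint": "minio:9000",
        "bucketnames": "loki-data",
        "access_key_id": "EXAMPLE_KEY",
        "secret_access_key": "EXAMPLE_SECRET",
        "insecure": True,
        "s3forcepathstyle": True,
    }
    config = shared_config("loki:7946", minio)
    for role in ("write", "read"):
        # Both nodes get the same config; only the target flag differs.
        print(role, target_flag(role))
    print(json.dumps(config, indent=2))
```

The sketch prints the structure as JSON for inspection only; the real file is YAML, and the target flag would go into the custom args variable of the environment file described in the walkthrough.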

Info

Channel: Grafana

Views: 533

Keywords: Grafana, Monitoring

Id: o34a9HVBgx4

Length: 20min 25sec (1225 seconds)

Published: Thu Nov 18 2021