How to collect logs with Fluentd

Captions
You probably already know what centralized logging is, and you want to build log stream pipelines. We already had an episode on Fluent Bit; today's episode is dedicated to Fluentd. Welcome to Is It Observable. The main objective of Is It Observable is to provide tutorials on how to observe a given technology. Today's episode is part of the Kubernetes series, but it is specific to logging. In the logging theme we already had one episode presenting Loki and Promtail, and another on Fluent Bit. Today we are going to focus on the big brother of Fluent Bit: Fluentd.

Here is what you are going to learn in this episode. First we will do a presentation of Fluentd; we will then see how to parse files and how to deploy Fluentd in a Kubernetes cluster, and we will finish with a tutorial chaining Fluent Bit and Fluentd.

Let's start with the presentation of Fluentd. As explained in the episode on Fluent Bit, Fluentd is the sister (or brother) of Fluent Bit. Like Fluent Bit, it is composed of several plugins: you naturally have input plugins to collect logs, output plugins to export logs, and many plugins that help us filter, parse, format and much more.

Input plugins define the sources of your stream. In our environment, as mentioned before, we have several components producing logs: database servers, applications writing logs to stdout or even to an HTTP endpoint, web server logs, load balancer logs and more. Fluentd's major advantage is the sheer number of plugins, especially input plugins, so you should definitely go through the Fluentd documentation where all of them are listed. tail is probably the one you are going to use heavily: it reads logs from a file. The http plugin receives logs pushed by applications. Then there is forward, which we will use in our example: forward receives logs from another log stream pipeline, whether it comes from Fluent Bit or another Fluentd. There are other input plugins such as tcp, udp, syslog, exec and many more.

When using an input plugin you will see that it is important to use tags. Tags are the concept that helps us design and build our pipeline; if you paid attention to the Fluent Bit episode, we briefly mentioned the notion there. So what can you do with tags? Depending on the source of our logs, the structure can differ: some applications use JSON, others plain text or XML. With the help of tags we can apply the right parser and the right filter to the data source we are trying to transform. You can also choose the destination of your log stream based on tags: say I am collecting Kubernetes logs that I want to send over to Dynatrace, while the logs coming from the HAProxy should go to Elasticsearch. Tags let you decide precisely what to do with each log stream. And because Fluentd is a log collector and forwarder, you can of course transform your logs using the filter, parser and formatter plugins; once you start playing around you will see the value of all of them.

We also mentioned output plugins. The goal there is to send logs to an endpoint, to a log storage backend or to another Fluentd, for example, and there are many plugins in the Fluentd community: you can store logs in a database, Kafka, Elasticsearch, stdout and more. stdout is a plugin we already used with Fluent Bit; it is useful when you have to debug. There is also http and more.
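To make the tag-based routing concrete, here is a minimal configuration sketch; the file paths, tag names and the Elasticsearch service name are hypothetical. It tails one file, accepts forwarded records, and routes them to different outputs by tag:

    # Read HAProxy logs from a file and tag them (hypothetical path)
    <source>
      @type tail
      path /var/log/haproxy.log
      pos_file /var/log/fluentd-haproxy.pos
      tag haproxy.access
      <parse>
        @type none
      </parse>
    </source>

    # Accept records forwarded by Fluent Bit or another Fluentd
    <source>
      @type forward
      bind 0.0.0.0
      port 24224
    </source>

    # Route HAProxy records to Elasticsearch (requires fluent-plugin-elasticsearch)
    <match haproxy.**>
      @type elasticsearch
      host elasticsearch.logging.svc
      port 9200
    </match>

    # Everything else goes to stdout, which is handy for debugging
    <match **>
      @type stdout
    </match>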
One of the main advantages of Fluentd is the large number of plugins built by the community, which is what makes Fluentd so powerful and popular. I am pretty sure Fluent Bit will have a similar journey, but as of now Fluentd clearly has more plugins. So if you don't find an input, output, formatter or any other plugin for what you have to achieve, check out the community, because there is probably already a plugin built by someone in the contrib repositories.

The other powerful feature is that Fluentd has a Prometheus plugin that lets us expose metrics about the log stream pipeline itself in Prometheus format. Similar to Fluent Bit, you can of course extract or transform a metric from your logs and expose it in Prometheus format, but Fluentd also gives us the option to expose observability metrics about our pipeline. For example, we can export records using the forward output plugin and use the Prometheus plugin to build a counter reporting the number of records we have forwarded, and you can add labels to those Prometheus metrics. In that example we would receive logs from another Fluentd or Fluent Bit pipeline (that is what forward is for), transform the data, and build a Prometheus counter counting the number of forwarded records.

Now let's see how to parse a file, with a useful use case: collecting logs from our HAProxy ingress controller. Almost everyone in the Kubernetes world uses either a service mesh or an ingress controller, so it makes sense to report a couple of indicators from it. Why look at the logs at all, when HAProxy already exposes metrics natively in Prometheus format? Because those logs usually contain a lot more detail, and we can take advantage of it.

So let's see how to parse the HAProxy logs. A typical HAProxy log line contains a lot of information: you can recognize dates, a process id and more. If we pay attention to the structure, there is the process name, the process id, the severity, the IP address and port, the date and time, the server, and then a couple of statistics such as the number of bytes, the active connections, the connections to HAProxy, to the backend and to the server, and so on. Once we know this structure and can recognize it, we can easily write a regular expression that matches the right pattern. I am not going to go through that expression in detail, but it basically extracts each of those fields.

So in the end we can build a log stream pipeline that collects the logs using the tail input plugin (because it reads from a file), parses them with our regular expression to extract the various fields, tags them so we know they come from HAProxy, adds the timestamp and any labels we want, and forwards them somewhere else. In the example on screen we use the tail input plugin, we set the path to the log file and the position file; remember, the position file is very important, because if your Fluentd agent gets killed, it will resume from the recorded position when it restarts. Then we declare the specific format we expect for our logs, and if a line matches, we tag it haproxy.tcp and add a time format.
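Here is a sketch of what that source could look like, assuming a simplified HAProxy TCP log layout; the real regular expression shown in the video captures more fields (and a time format), so treat the pattern below as illustrative only:

    # Tail the HAProxy log file and parse each line with a regular expression
    <source>
      @type tail
      path /var/log/haproxy.log
      pos_file /var/log/fluentd-haproxy.pos
      tag haproxy.tcp
      <parse>
        @type regexp
        # Simplified pattern: process name/id, severity, client ip:port, then the rest.
        # A time_key/time_format pair could be added here, as in the video.
        expression /^(?<process_name>\w+)\[(?<process_id>\d+)\]: (?<severity>\w+) (?<client_ip>[\d\.]+):(?<client_port>\d+) (?<message>.*)$/
      </parse>
    </source>

    # Print the parsed records to Fluentd's own logs, to check the pipeline
    <match haproxy.**>
      @type stdout
    </match>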
If we want to send these logs to stdout to test, we just add a simple match of type stdout to our pipeline, as in the sketch above; there is not much to do, and the records will show up in the logs of our Fluentd pipeline.

Now, how do we deploy Fluentd in a Kubernetes cluster? In the Kubernetes world Fluentd is deployed as a DaemonSet, which means one pod per node, and each of those pods collects the logs of its node. Then of course we have to build our log stream pipeline: add the context, the structure, where we want to forward, what the input is, and so on. That log stream pipeline is in reality a configuration file. Most of the deployments you find on the internet for Fluentd still use a plain configuration file; the recommended approach is to turn that configuration file into a ConfigMap, which is much easier to maintain and deploy.

In the case of Kubernetes, as we already saw in the first episode on Loki, all our pods generate logs by writing to stdout, and the Docker engine takes what is written to stdout and produces files in a standardized location. So if you use Fluentd in a Kubernetes environment, you will probably use tail to collect the various logs generated by your pods, and you will have to add the corresponding part to your pipeline: type tail, the path to the container logs, the position file, and you will also have to adjust the time format. And of course, as explained before, you should definitely tag your source so you can easily transform it afterwards.

If you are going to use Fluentd in a Kubernetes cluster, you are going to use the fluentd-kubernetes-daemonset repository. In that repo you will see different types of deployment based on the various plugins available: Elasticsearch and many others. Note that each one uses a specific container image that includes some predefined plugins; the repository provides prebuilt images for a set of plugins such as Elasticsearch, CloudWatch, forward, GCS, Graylog, Kafka, Kinesis, Azure Blob, Loggly and many others. If there is no image for your target plugin, you will probably have to build it yourself, and in that case I recommend building your new Fluentd image on top of an existing image built by the community. In my example, because I need the forward plugin (I want to chain Fluent Bit with Fluentd), I will build my image from the image containing the forward plugin. The idea is to make sure your image has enough plugins to achieve your pipeline: figure out what you need to transform the logs, collect them and send them somewhere, identify the right plugins, and then build your image by installing those plugins step by step. In the tutorial I will walk you through building your own custom Fluentd image; you will see it is very easy.

One more thing: to let Fluentd run in a stable way as a DaemonSet, we have to make sure it has the right permissions on a couple of folders. It needs read access to /var/log, because that is where we collect the container logs, and since we use a position file, it obviously needs write permissions in the folder holding that file.
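As an illustration, here is the kind of volume wiring a Fluentd DaemonSet typically uses to get that access; the names and image tag are illustrative, and the exact host paths depend on your container runtime:

    # Excerpt of a Fluentd DaemonSet pod spec (names and image tag are illustrative)
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1-debian-forward
        volumeMounts:
        - name: varlog
          mountPath: /var/log              # container logs plus the Fluentd position file
        - name: dockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true                   # files the /var/log/containers symlinks point to
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: dockercontainers
        hostPath:
          path: /var/lib/docker/containers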
The objective of this tutorial is to install Fluentd, collect logs from our cluster and forward them to Dynatrace. To achieve that, several steps are required. Of course we need a Dynatrace tenant, and you can start a trial. To receive logs in Dynatrace there is a log ingest API, and that API is exposed on an ActiveGate; you can install an ActiveGate on bare metal, a VM or a server, or deploy it in a Kubernetes cluster, and today that is what we are going to do: deploy the ActiveGate in our cluster.

Like every episode we produce for Is It Observable, there is a GitHub repository with all the steps we are going to follow in this tutorial, and here you can see the repository for the Fluentd episode. As in previous episodes we are going to use a Kubernetes cluster; in my case I use a GKE cluster, but if you want to use minikube, Rancher or anything else, you are free to do so. There is nothing specific to Google here, it is just easier for me. I won't go through the steps of building the cluster or deploying Istio, Prometheus and Grafana; they are listed as prerequisites in the tutorial. You first need a cluster, then we deploy Istio because we will use it as the service mesh in the cluster, then we deploy the Hipster Shop as usual and create the right gateway to expose it outside the cluster, and finally we deploy the latest version of Prometheus together with Grafana and create the right Istio rules to expose Grafana outside the cluster. Once all of that is done, you can jump to the step where you install Fluentd and go through the exercises of this tutorial.

If you don't have a Dynatrace tenant yet, it is not a big deal: go to dynatrace.com/trial, put in your email address and start a free trial, and you will receive an email with your dedicated tenant. In my case I am showing a dev tenant, which may have a few more features, but that doesn't change much. What matters is the URL at the top: from your trial you will get a live tenant, something like your-id.live.dynatrace.com. Keep note of that id somewhere, because we will need it later in the tutorial.

Then, because we are going to deploy the ActiveGate (remember, the log ingest API of Dynatrace is exposed on the ActiveGate), there are several ways of doing it. You can go to Deploy Dynatrace in the left-side menu, where there is an option to install an ActiveGate to extend monitoring; that will suggest a deployment on Windows, Linux or Kubernetes.
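Since the whole point of the ActiveGate here is to expose the log ingest API, this is roughly what a call to that API looks like once everything is wired up; the host, environment id and token below are placeholders, and the exact path may differ by Dynatrace version, so check the Log Monitoring API documentation:

    # Hypothetical example: pushing one log record through an environment ActiveGate
    curl -k -X POST \
      "https://my-activegate.example.com:9999/e/abc12345/api/v2/logs/ingest" \
      -H "Authorization: Api-Token dt0c01.XXXXXXXX" \
      -H "Content-Type: application/json" \
      -d '[{"content": "hello from fluentd", "log.source": "manual-test", "severity": "info"}]'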
We won't go through that UI to deploy it, though; I have a manifest file to deploy the ActiveGate in Kubernetes. But if you want to run one on another server, somewhere else, or as a Docker image, you can do so with the Linux or Windows installation.

Two things are required to connect Fluentd to Dynatrace: we need to create tokens. The first one is the Platform as a Service (PaaS) token. In the integration section of Dynatrace (we will also have to create the Dynatrace API token later, but the one I need first is the PaaS token), go to that section, generate a fresh token and note it down, because we will need it to deploy Fluentd. The second thing we need is the Dynatrace API token. I already have one here; you can see a fluentd token, and if I expand it you can see its rights, or you can create a new one if you want. The only right I need to ingest logs from Fluentd is the log ingest permission on the API v2; that is the minimal right for this use case. If you want to go further and include OpenTelemetry traces you would also have to enable that right, and if you want to ingest metrics there is a right for that as well, but in our exercise a token with just the log ingest right is enough. So, to recap this first part: start a fresh Dynatrace trial, generate your PaaS token, and generate your Dynatrace API token.

Then we need to build the Docker container for Fluentd with the Dynatrace output plugin. Now that we have our tokens and our Dynatrace URL, the next thing is to generate the container image. As explained, in the Fluentd project there is a specific repository for deploying Fluentd as a Kubernetes DaemonSet, with several example images: Debian-based images including various plugins, in x86 or arm64, so there are plenty of variants. In my case I wanted a multi-arch image and I just wanted forward, so I picked the image related to the forward plugin. There are also example manifests in that repo, but we are not going to use them as-is, because we are going to change the ConfigMap and the deployment slightly.

The next step, once we have the reference image, is to build a Docker image that includes the Dynatrace output plugin. In the dynatrace-oss organization (OSS stands for open source) there is a Fluentd plugin for Dynatrace, with a folder explaining how to deploy it, but it does not provide any image. So I have prepared a Dockerfile in the GitHub repo: as you can see, I take the base image with the forward plugin and then I run a few installations to include the Ruby gems I need, the Dynatrace plugin, the Kubernetes metadata plugin and so on. In your case, if you need other specific plugins, you would do the same thing.
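For reference, here is a sketch of what such a Dockerfile can look like; the base image tag is illustrative, so align it with the actual Dockerfile in the episode repository and the plugin documentation:

    # Start from the community image that already ships the forward plugin (tag is illustrative)
    FROM fluent/fluentd-kubernetes-daemonset:v1-debian-forward

    # Install the extra plugins as Ruby gems (run as root so gem can write to the system gem path):
    # - fluent-plugin-dynatrace: the Dynatrace log ingest output (dynatrace-oss)
    # - fluent-plugin-kubernetes_metadata_filter: enrich records with pod/namespace metadata
    USER root
    RUN gem install fluent-plugin-dynatrace fluent-plugin-kubernetes_metadata_filter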
Building the container is very easy. I open a command line, although you could of course do it in a CI/CD process. I jump into the folder containing my Dockerfile, which is the fluentd folder, and the only thing I need to do is run docker build with a dot (to reference the Dockerfile in the local folder) and a tag: we can name it something like isitobservable/fluentd-dynatrace and add a version, say 0.1. I run the build and you can see it pulls the base image and installs the various Ruby packages. Now that this fresh new image has been built, the only thing left to take advantage of it is to push it to a Docker registry; in my case I push it to Docker Hub, and I have already done that.

We will first use Fluentd to collect the logs and forward them to Dynatrace. Now that we have our container and our various tokens, one thing I almost forgot is to get the cluster ID, because we need to identify the cluster in Dynatrace once it has been registered through the ActiveGate. So I copy the command from the repo to get my cluster ID (you can do the same on your side) and note it down, because I will need it later; it is required to deploy the ActiveGate, so don't forget this step.

Now that we have the cluster ID, the tokens and the image, the next thing to do is deploy a service account. In the fluentd folder I have a service account manifest file that creates the dynatrace namespace and includes the cluster role, the cluster role binding and of course the service account for that namespace. Let's deploy that file with kubectl apply; it creates our namespace and service account. Done.

For the next section we are going to need all the information we have collected so far: your environment URL (as I mentioned, without the https:// prefix; in your Dynatrace tenant it looks like your-id.live.dynatrace.com), the cluster ID we just retrieved from Kubernetes, the PaaS token, the API token, and the environment ID (remember, the first part of your Dynatrace URL is the environment ID). Run those commands and then we can create the secrets in the dynatrace namespace, because we will need them. Once those secrets have been created, we will update two files: the Fluentd manifest and the ActiveGate manifest.
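Put together, the commands for this step look roughly like this; the image name, file paths and secret keys are placeholders from my example, and deriving the cluster ID from the UID of the kube-system namespace is one common convention, so use the exact commands from the episode repository:

    # Build and push the custom Fluentd image (names/tags are illustrative)
    cd fluentd
    docker build . -t isitobservable/fluentd-dynatrace:0.1
    docker push isitobservable/fluentd-dynatrace:0.1

    # One common way to derive a cluster ID: the UID of the kube-system namespace
    CLUSTER_ID=$(kubectl get namespace kube-system -o jsonpath='{.metadata.uid}')

    # Create the dynatrace namespace, service account and RBAC objects
    kubectl apply -f fluentd/service-account.yaml

    # Store the tokens as a secret in the dynatrace namespace (keys are illustrative)
    kubectl create secret generic tokens \
      --namespace dynatrace \
      --from-literal=paas-token="$PAAS_TOKEN" \
      --from-literal=api-token="$API_TOKEN"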
Let me briefly show you those two files so you understand what is inside. The first one is the ActiveGate manifest: the ActiveGate, like I said, is the component running in your cluster that ingests the various log streams and sends them back to Dynatrace. You can see that the ActiveGate image has to point to your Dynatrace environment and that it uses a couple of secrets and a config, so you need to refer to those; it also needs the cluster ID, which is why we create a ConfigMap. The ActiveGate manifest mainly refers to your environment ID.

If we now jump to the Fluentd manifest, you can see my ConfigMap: instead of a plain Fluentd configuration file, the pipeline is included in a ConfigMap, and there are two variables inside it. There is the cluster ID, which gets replaced by my set command, and the ActiveGate URL, which is static because it points to the ActiveGate we deploy, but it still needs the environment ID to be replaced, again through the set command. The pipeline itself is in there too (we will look at it in a moment), and at the end it uses the Dynatrace output. So let's run those set commands, and once they have been executed I can apply and deploy Fluentd.

Now that we have created our secrets and updated the deployment files (the Fluentd manifest and the ActiveGate manifest), we simply have to deploy them with kubectl apply -f. In the directory, let's first deploy the ActiveGate: that one has been created. Now let's deploy the Fluentd DaemonSet. All right, let's have a look at the namespaces and the pods running in the dynatrace namespace: we have the ActiveGate and the three Fluentd pods, one per node. Let's take one of them and look at its logs: the pipeline is running, and it says the records are sent to Dynatrace, so it seems to work. The first errors we had were related to the ActiveGate not being fully started, but since it started there is a working interaction between Fluentd and Dynatrace.

The only missing part, the last step, is to connect the Kubernetes cluster to the Dynatrace tenant. In fact it is already connected through the ActiveGate; it is just not yet registered as a cluster. For that you have to run two commands: one to get the Kubernetes API URL and the other to get the bearer token for the API.
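The two values you paste into that screen can be retrieved with commands along these lines; the service account name and namespace are placeholders, so adapt them to whatever your service account manifest actually creates, and note that on newer Kubernetes versions the token secret may need to be created explicitly:

    # Kubernetes API endpoint of the current cluster
    kubectl config view --minify --output jsonpath='{.clusters[0].cluster.server}'

    # Bearer token of the monitoring service account (name and namespace are illustrative)
    kubectl get secret \
      $(kubectl get serviceaccount dynatrace-monitoring -n dynatrace -o jsonpath='{.secrets[0].name}') \
      -n dynatrace -o jsonpath='{.data.token}' | base64 --decode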
Then go to Settings, Cloud and virtualization, Kubernetes, and connect a new cluster: put in the API URL and the token, and enable a few of the options to let Dynatrace monitor it automatically. Ah, my mistake: you also need to disable the SSL certificate check; this is a demo, not a production environment, so that is acceptable here. Now I can connect, and you can see it connecting to the API server. All right, our new tutorial cluster has been added.

If you want to see the logs coming in, you can jump to the log section: in the menu there is a Logs entry, and clicking it opens the log viewer screen. Because I have two clusters running, I have logs coming from both environments, so I need to create the right filters to only see the logs related to our tutorial. What I do here is filter on the nodes, the is-it-observable ones; now I only have the logs from those nodes, ingested over the last 30 minutes, and this is obviously the data coming from our pipeline. Let me just confirm: here you can see the log stream, and it is indeed the cluster we just added. So ingesting logs with Fluentd works very well, no major issue, and it is very easy. If you want to take advantage of it you can then create the right filters, extract more data and build metrics out of those logs for dashboards, similar to what we did the other day with Loki.

Before we jump to the second exercise, let's look in detail at the pipeline we are currently running. As the source, similar to what we explained before, it uses tail and reads the container logs from the nodes. There are two parsing branches: one checks for JSON format, and there is also a regular expression to extract stdout versus stderr and the other fields. Then, for all the log streams matching the tag assigned at the source, we remove the raw prefix, rename message to log and the stream field to stream, handle multi-line ingestion and set some limits. We then filter a few things, add the Kubernetes metadata, and add more details: for instance we add the Kubernetes cluster ID, which we need to register the information properly in Dynatrace, and a few Dynatrace-specific dimensions that are grabbed automatically in those sections. And at the end, the only thing I do is send the log stream over to Dynatrace. A very simple pipeline.
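The back half of that pipeline, the Kubernetes metadata enrichment followed by the Dynatrace output, looks roughly like this; the tag pattern, URL, token handling and parameter names follow the dynatrace-oss fluent-plugin-dynatrace README as I understand it, so double-check them against the plugin version you install:

    # Enrich container log records with pod, namespace and container metadata
    <filter kubernetes.**>
      @type kubernetes_metadata
    </filter>

    # Ship everything to the Dynatrace log ingest API exposed by the in-cluster ActiveGate
    <match **>
      @type dynatrace
      active_gate_url https://activegate.dynatrace.svc.cluster.local:9999/e/abc12345/api/v2/logs/ingest
      api_token dt0c01.XXXXXXXX        # in the real manifest this comes from a secret
      ssl_verify_none true             # the in-cluster ActiveGate uses a self-signed certificate
    </match>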
Next, I would like to explain how to chain Fluent Bit and Fluentd using the forward plugin. In this exercise we will end up with two distinct pipelines: one with Fluent Bit, which collects the logs from Kubernetes, parses them and just forwards them on to Fluentd; and the Fluentd pipeline, which receives the log stream from Fluent Bit and forwards it to Dynatrace. This is just an example to show you that you can chain both technologies.

All right, for this last section of the tutorial we are going to deploy Fluent Bit. To do that we first have to delete our previous Fluentd deployment, the one that was collecting everything, because we will deploy a slightly different one. The first thing to do is create the following objects: let me run kubectl create namespace to create a namespace for logging; the namespace has been created. Now let's apply the three files that create the cluster role, the service account and the cluster role binding; those are done, and the old deployment has been deleted.

What we need now is the new deployment files in the repo. There is a dedicated fluent-bit folder, and there are two things in it. First, the Fluentd manifest, which of course includes the pipeline for Fluentd; we need to update it to point to the right cluster ID and so on. The pipeline is a bit different: I just collect the stream coming from the forward input and then do the rest of the pipeline, where I add the metadata and the Dynatrace details. So the difference from the previous exercise is that the source part now lives in Fluent Bit, and everything else stays in Fluentd. Second, the Fluent Bit manifest: you can see a few sections, where I collect the logs from Kubernetes, tag them, enrich them with a few things from the Kubernetes API, and forward everything to the Fluentd service, fluentd-logging.dynatrace.svc, because it sits in the dynatrace namespace; so I am referring to the service in front of the new Fluentd deployment.

So let's update the two deployment files, or at least the Fluentd one, because we need those values; then we apply the Fluentd manifest first, because it needs to be running, and next the Fluent Bit one. The file has been updated, so I can apply the Fluentd manifest for the Fluent Bit scenario and then deploy Fluent Bit itself. This is all in one cluster, but you can obviously imagine a use case with multiple clusters where you chain this process across them.

Now let's have a look at what we have in terms of pods, first in the dynatrace namespace: we have our three Fluentd pods. Let's also check the service, because remember Fluent Bit refers to it and I want to make sure nothing is wrong: fluentd-logging is there and the forward port is open (24224 is the default). Let's take one of the Fluentd containers and look at its logs: nothing is coming in yet, nothing matched. Let's check that everything is running: kubectl get the service in the dynatrace namespace, it is listening; the pods in the dynatrace namespace are up, although I don't know yet whether Fluentd is receiving anything; and then the pods running in the logging namespace, where we have a couple of Fluent Bit pods. Let's look at the logs from one of them with kubectl logs -f: records are coming in. Again, this particular pipeline may not be perfectly tuned, but I wanted to show the example. The idea to keep in mind is that you can definitely chain a Fluent Bit and a Fluentd together with the help of forward; forward is the way these two solutions communicate with each other, and it works just as well from Fluentd to Fluentd or from Fluent Bit to Fluent Bit.
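For completeness, this is the shape of the hand-off between the two agents, with a placeholder tag pattern and the default forward port; the service name matches the one used in my manifests:

    # Fluent Bit side: forward everything tagged kube.* to the Fluentd service
    [OUTPUT]
        Name   forward
        Match  kube.*
        Host   fluentd-logging.dynatrace.svc.cluster.local
        Port   24224

    # Fluentd side: accept the forwarded records, then hand them to the rest of the pipeline
    <source>
      @type forward
      bind 0.0.0.0
      port 24224
    </source>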
That's it for today's episode on Fluentd. We went through two exercises in the tutorial: the first one collecting logs in your cluster with Fluentd and forwarding them to Dynatrace, where you also saw how to install an ActiveGate within your cluster; and a second exercise focused on how to chain Fluent Bit and Fluentd, where Fluent Bit collected the logs and forwarded them to Fluentd, and Fluentd just sent them on to Dynatrace. The idea, again, is that Fluentd has more plugins, so you may sometimes need to chain those two products together to build an efficient log stream pipeline. If you enjoyed today's episode, don't forget to like and subscribe to the channel; we will produce more episodes in the coming weeks, and if you have a topic in mind that you would like us to cover, let me know through LinkedIn, Twitter or even YouTube. Thanks for watching and see you soon for another episode. Bye.
Info
Channel: Is it Observable
Views: 188
Keywords: cloud, dynatrace, fluentbit, fluentd, k8s, kubernetes, logs, observability
Id: j76ozzIbuO8
Length: 45min 1sec (2701 seconds)
Published: Thu Nov 25 2021