Getting Started with Logging in Kubernetes - Eduardo Silva, Treasure Data (Any Skill Level)

Captions
Good — I know that it's a bit late; right after lunch is not always a good time to attend a talk, so thanks for coming, please take a seat. My name is Eduardo Silva, I'm a software engineer at this company called Treasure Data, and as the company name says, we care a lot about data. We have a SaaS platform where you can ingest your data — your marketing data or whatever — and get some insights from it. But when you have a platform to manage data, there is one missing step: you can have a platform, but what really gives you the value is the data, and that's why we got into this world of logging like four years ago, creating data tools in an open source way to collect data from your services and your own applications. So this talk is about how to get started with logging in Kubernetes. Our company is also one of the sponsors of the Fluentd and Fluent Bit projects — would you please raise your hand, who's using Fluentd right now? OK, this is my crowd, thanks. As I said, I'm a software developer at Treasure Data; I'm part of the Fluentd open source team, but my primary role is to maintain and develop this little project called Fluent Bit, which maybe you have been hearing a lot about in the last year.

So let's talk about applications and logging. When you have some kind of application, the old way to perform logging is just to write a message to some special place, and most of the time that is the hard disk, a log file — that, for us, is called logging. If you are over 30 years old, you played a lot with syslog and file-system log files; if you are in your 20s, maybe you're wondering what that is, and you're using journald or something more fancy right now, right? OK. But logging happens in different ways, not just to the file system. When you have a Unix process running in your environment, you actually have three main channels of communication. The first one is the standard input, which can be used to send messages to the application: when you're using your terminal on macOS or Linux and you type some command and put a pipe, what you're doing is just redirecting the output of one program to the input of the other, and piping happens because of this streaming interface. But when we talk about logging in general, and specifically in the container world, we are talking about this — hey, we have some lag here, oh, the battery, there you go, I'm going to stay on this side — logging happens specifically on the standard output and standard error interfaces. And if you are dealing with containers, you realize that this sounds a little bit familiar.
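As a minimal illustration of the three standard channels and the piping described above (the program name myapp is purely hypothetical):

    # Every Unix process gets three standard streams:
    #   0 = stdin, 1 = stdout, 2 = stderr
    # A pipe connects one program's stdout to the next program's stdin:
    cat /var/log/syslog | grep error

    # A containerized application simply writes to stdout/stderr and the
    # runtime captures both streams; the same idea shown with plain redirection:
    ./myapp 1>app.out 2>app.err    # 'myapp' is a hypothetical program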
So, logging in Docker containers — who's using Docker at the moment, at some level? OK, 80 percent, cool. Docker also has its own strategy to perform logging. First of all, I assume that you know what a container is, right? A container is not Docker: Docker is just a wrapper, a tool that lets you work with this concept of a container. A container is a process running in a contained space which is restricted by the Linux kernel, meaning that you have namespaces and cgroups which allow you to say to this application: you only have access to this filesystem, or maybe not; you have access to the network, or maybe only this amount of CPU time. But dealing with those kinds of system calls, or using the command line with LXC, takes some time, and that's why Docker provides a full interface on top of that to manage containers — and it also takes care of logging.

So if an application prints a message, what actually happens is that Docker says: OK, I got a message, but that message is not stored as a standalone message like in the old log files, because we have some context. This context means which stream the information is coming from — for example the standard output — and at what time this information was created, and that is really, really relevant. And if you look carefully — I don't know if you ever look at the log files of Docker; OK, nobody does it, and you don't want to do it, because it's a path with a lot of hashes and things that are not very clear, and it's a waste of time unless you are trying to build a logging solution. My goal here is to explain a little bit how logging works behind the scenes, because if you understand that, you can optimize at a higher level.

So when your application prints a message, what Docker does is create a JSON map — assuming you use the JSON driver; there is the journald driver and others, but here we are basing everything on JSON — and it stores this in the file system where the Docker engine is running. Basically it uses /var/lib/docker/containers/ plus the container ID hash — every unique container identifier is stored in the file system — and then it appends to the final log file. From an operational perspective this matters because, if you want to manage logs for Docker, you have to discover the log files and read each container log file to find out what kind of messages the application in the container is generating. Also, I don't know if you looked carefully at the previous slide, but the message is stored in one key which is called "log". The log key is the important one, because it is the one that contains the message, and of course you also want to obtain some kind of metadata.

So let's look very quickly at how this works locally. I'm going to use my prepared commands here so we don't waste time with typos — I learned that at a past conference. Basically we are going to run a Docker container with busybox and just print out a message, but I'm running it with the daemon flag, which means it runs in the background but also gives me the container ID, which is the one I want to use later to find the logs in the filesystem. Of course you can use the docker logs command, but here we are doing the manual stuff. OK, this is the container ID, the container hash — of course you're not going to remember that, but what I want to do is dig into my file system and see where this exists. I'm going to become sudo: /var/lib/docker/containers — oh, we have a bunch of containers. Is that fine, or too small? OK, thanks, you're still young; I'm not. If you look inside that path you will see that the container has a lot of information associated with it, like its config, whether it wants some volume and so on, but here we just care about one piece, which is the log file. If I cat the log file I will find my message — it's a JSON message in the filesystem. And we can do this with jq, it's more beautiful. OK, that's it: your application prints a message, it goes out through some streaming interface, and then it's wrapped by the Docker engine; the Docker engine creates a new log file, and this log file can be read by anybody, or by the Docker tools, which do exactly the same thing that I'm doing right now. OK, I wanted to show you that because you need to understand it before we move forward.
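A minimal sketch of what was just demonstrated, assuming the default json-file logging driver (the echoed message is arbitrary):

    # Run a container that prints one message per second, in the background;
    # `docker run -d` returns the full container ID
    CID=$(docker run -d busybox sh -c 'while true; do echo hello; sleep 1; done')

    # The Docker engine appends each line as a JSON record on the host
    sudo cat /var/lib/docker/containers/$CID/$CID-json.log

    # Same file, pretty-printed with jq: note the "log", "stream" and "time" keys
    sudo cat /var/lib/docker/containers/$CID/$CID-json.log | jq .

    # The supported way to read exactly the same data
    docker logs $CID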
So now, applications in Kubernetes — a quick overview. Who's running Kubernetes right now? OK, so I have an idea. Pretty quickly, the concepts: in Docker you have one application running in a container, and that's fine, but here we have a few additions. An application runs in a container, but for Kubernetes a container is not an object or a concept it deals with directly; what Kubernetes knows about is a pod, and a pod is a concept that allows you to group different containers. For example, if you have a web server and some database, you can run both in the same pod because they belong to the same context — that's just an example, you can run it that way or not — but you can have many containers running in the same pod; usually you have one, but you can have many. And a node — by node I mean a host, a physical or virtual machine — can have many pods, which means multiple applications, and you know that a Kubernetes cluster has many nodes.

So the thing is: how do you deal with logging when you have a cluster — when, for example, you deploy your application and it gets replicas on different nodes? How do you gather the logs, how do you centralize them? Kubernetes is really good at the rest: it has scheduling, self-healing and all those things, but for logging we have new challenges. It's not enough to go to the file system, as I just did, and query each log file. Ten years ago, if you had some issue in an application, you used to SSH into each server or use some kind of remote syslog to do your troubleshooting, but here things change — it's a game changer, and I would say for good, for whoever manages logging. Of course we need to do more work. Actually, I'm sure that nobody here finds logging fun, because no, logging is not fun, but if you are here it's because you have to deal with it, right? And we have to fix it.

So, the logging context in Kubernetes. In Docker we used to have the pipeline: the standard output, the log message and the timestamp. But if you are running in Kubernetes there is extra information: the pod name — which pod this belongs to; the pod ID, because you can have the same pod name restarted ten times and the pod ID will be different; the namespace it belongs to; the node; the labels; and the annotations. Do you know what labels are? No? OK. Imagine that you have several applications and different environments — for example production, testing and development. How do you differentiate between them? OK, I create namespaces — cool, you can create namespaces and group the pods associated with development in that namespace, or production, or testing, whatever. But if you want to query the API server in Kubernetes and say "please show me all the databases from the three environments", you use labels. That means that when you create the pod you attach some kind of label to it; you say "this is my application, version four", and later you can ask "please show me all the information that you have about application version four". You can add as many labels as you want, because labels allow you to do selection over the resources inside Kubernetes, and labels — and annotations — are also the key to doing better logging.
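A sketch of how labels drive that kind of query (the label keys, values, app name and image are illustrative):

    # Pod manifest carrying labels the API server can later select on
    apiVersion: v1
    kind: Pod
    metadata:
      name: myapp-v4
      labels:
        app: myapp
        version: "4"
        env: production
    spec:
      containers:
      - name: myapp
        image: myapp:4          # hypothetical image

With labels in place, a selector query such as kubectl get pods --all-namespaces -l app=myapp, or a set-based selector like -l 'env in (production,testing,development)', returns the matching pods across environments.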
So if we think about the log processor, we need to think about this: we have the log, which is a message, but we need to understand the context of that message, because it belongs to a container name, to a pod, to a namespace, to pod labels and annotations. If we cannot get the context, the information is irrelevant. From a log processor perspective — I'm talking in a general way, it doesn't matter which log processor you are running — it has to deal with this: it gets the container name and container ID from the file system or from journald, but if you want to correlate all this information with the Kubernetes context, you need to query the API server. When you have a Kubernetes cluster you have an API server, which lives on what is called the master, and the master — through etcd — knows the state of all the pods and all the nodes. So if I'm running a pod on node B, the API server knows what labels and annotations are associated with that pod running on that node. At some point you need to take all of this information and merge it, and that is where the log processor needs to do the hard work. This is not straightforward from the beginning unless you have the right tools, and that's why we have open source and, kind of, certified tools for logging.

It's like this: if you are doing logging, it's because you want to do some data analysis, but in order to perform data analysis you want to centralize your information, otherwise you cannot do it. And the whole deal of log processing and logging in Kubernetes in general is that you need to take the information from multiple sources and unify it in a central place, like a database — it doesn't matter which one: it could be a streaming system like Kafka, it could be Redis, MySQL, Elasticsearch or whatever. But the important piece here is the log processor, because if the log processor is not able to correlate information between the pods, the nodes and the labels, you cannot get the right insights.

So, log processing in Kubernetes — this is how it works. Do you know what a DaemonSet is? OK. Everybody knows what a pod is — yeah, I just explained that: a pod is one or more containers running on a node. A DaemonSet is a special kind of pod which runs on every node of your cluster. If you deploy your application and say "OK, this is not going to be a plain pod, it's going to be a DaemonSet", Kubernetes will automatically keep at least one replica of that specific application pod on every node of the cluster, and if you add a new node to the cluster, when it starts up and bootstraps, it will start a new pod there, because the pod belongs to that specific DaemonSet. So if we want to solve logging in Kubernetes, we want our log processor to run on every node of the cluster, because — imagine that you have just one node, one virtual machine, and that virtual machine has many pods — all the logs from those pods are in the file system or in journald of that specific environment; the API server doesn't know anything about logging. So what you need is the log processor running locally on the node. Of course this log processor, as a DaemonSet, needs special permissions to read the log files from the file system — it reads /var/log/containers, but those are only symbolic links to the Docker engine files that I just showed you before. So what we do is basically deploy the log processor as a DaemonSet and then start reading the logs.
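A minimal sketch of such a DaemonSet, under the assumptions described above (names and image are illustrative; a real deployment also needs a service account allowed to read pod metadata from the API server):

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: log-processor
      namespace: logging
    spec:
      selector:
        matchLabels:
          app: log-processor
      template:
        metadata:
          labels:
            app: log-processor
        spec:
          containers:
          - name: log-processor
            image: fluent/fluent-bit:0.13      # any log processor image works here
            volumeMounts:
            - name: varlog
              mountPath: /var/log              # /var/log/containers symlinks live here
            - name: dockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true                   # the actual json-file logs
          volumes:
          - name: varlog
            hostPath:
              path: /var/log
          - name: dockercontainers
            hostPath:
              path: /var/lib/docker/containers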
We're going to read each container log file, and that is expensive, by the way — it doesn't matter if it's the file system or journald, it will be expensive anyway, expensive in terms of computing — and then we need to go to the second step, which is gathering the metadata, the labels and annotations, from the API server, because we want to merge this information.

OK, let's do a simple example of how this works. Here I'm running minikube — are you familiar with minikube? Yeah, most of you. If you don't know it, minikube is a single instance of Kubernetes that runs in a virtual machine on your own computer — but make sure to close Slack and all those things. Elasticsearch, yeah — that's why when I was starting the demo I had to look at the memory available, because I had to start minikube, Elasticsearch, Kibana, and that's it. OK, so minikube is running. I'm going to look at the pods that are running on this specific instance: there's nothing in the default namespace; let's make sure everything is fine — right, this is my own single-node setup.

What I'm going to do now is deploy a pod, an application, and this application is just a fake Apache web server simulator: what it does is write simulated Apache log lines to the standard output, randomly, one every second. This is the base image — if you have your computer you can run it. Let's run it here with docker run; I'm going to run it locally, outside of Kubernetes, just as a Docker container. There you go: it just prints random messages with an IP address, one per second. OK, now I'm going to deploy the same thing, but in my single-node Kubernetes cluster: kubectl create with the Apache demo manifest, then kubectl get pods with watch. Now let's use the Kubernetes logging tool to see the logs from that specific pod: kubectl logs apache-logs, and follow the logs — there you go. So the pod is running, but right now those logs are just in the filesystem of minikube, and we can dig into that: do minikube ssh, just for demo purposes, and look at /var/log/containers — OK, here are the containers. This is our friend, but this friend has more information than in the plain Docker context: we used to have just a hash, the container ID, but here we have the namespace and the name of the pod right in the file name. And if we cat that file we get the JSON message, for example this one. OK, that's fine; until now there's nothing fancy. I'm going to exit from minikube.

kubectl get pods: apache-logs is running. But what I want to do now is install my log processor as a DaemonSet, run it locally, tail the logs, process them and ingest them into a database — for demo purposes we'll use Elasticsearch; we will see if that works, because I reduced the memory available just now. OK, it's still alive, cool. And just for sanity purposes we are going to wipe the database — don't do this at home or in production, this is a magic key — oh, there you go. So if we use curl, the HTTP client, to query the database to see whether we have some records... there's nothing, we have nothing — cool, it's empty.
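Roughly the commands from this part of the demo (manifest and pod names are illustrative; the database wipe is deliberately omitted):

    # Deploy the fake Apache pod and follow its logs through the API server
    kubectl create -f apache-logs.yaml
    kubectl get pods --watch
    kubectl logs -f apache-logs

    # The same records live on the node's filesystem
    minikube ssh
    ls /var/log/containers/        # symlink names include pod name and namespace
    sudo cat /var/log/containers/apache-logs_default_*.log
    exit

    # Sanity check against the demo Elasticsearch before deploying the log processor
    curl -s localhost:9200/_count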
So the next step is to deploy the DaemonSet. If you want to install a log processor you just need the right documentation and the right commands. For example, I'm going to show the namespaces: I created a namespace called logging beforehand, and I'm going to deploy my DaemonSet into that namespace, just for the sake of keeping things tidy — there's nothing special there. And I'm going to deploy Fluent Bit — Fluent Bit is a log processor; I'm going to talk more about it later. First we're going to deploy a ConfigMap. Do you know what a ConfigMap is? Yes? OK. When you have a container running and you want to ingest configuration, you have two ways: either you build your own container and put the configuration file inside it when you build it — but if you want to change the configuration you have to rebuild it — or you can pass the configuration files as mounted volumes, so when the container starts it sees the files in a volume that was mounted. A ConfigMap is an object in Kubernetes that allows you to ingest configuration into the pod.

This is not something you need to memorize, but this is the configuration for Fluent Bit: this is the input, from where I'm going to collect the logs — right now I'm going to process only the Apache logs — and I'm going to set the Docker parser; then I'm going to filter the data, and what the filter does in this case is go to the API server, get the labels, the metadata and the annotations, and merge them in; and then I'm going to ship this information — because this is the output — to Elasticsearch (we write "es" for short, but it's Elasticsearch), and we are using the magic IP in minikube, the IP that lets you talk directly to the host. OK, so I'm going to ingest the configuration via the ConfigMap. OK, ready.
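A sketch of what such a ConfigMap carries, in Fluent Bit's classic configuration format (the path, tag and the minikube host IP 10.0.2.2 are assumptions for this particular demo setup):

    [INPUT]
        Name    tail
        Path    /var/log/containers/apache-logs*.log
        Parser  docker
        Tag     kube.*

    [FILTER]
        Name    kubernetes
        Match   kube.*
        # queries the API server and merges pod labels/annotations into each record

    [OUTPUT]
        Name    es
        Match   *
        Host    10.0.2.2
        Port    9200

The ConfigMap is created with kubectl and mounted into the DaemonSet's pods as files.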
And now I'm going to deploy Fluent Bit, the DaemonSet for minikube. This is the Docker image for Fluent Bit that I'm using right now; that version was released this Monday, so it's the new one. So: kubectl create with the minikube DaemonSet — there you go. kubectl get pods: OK, Fluent Bit is running. So what does that mean? I'm going to query the Elasticsearch database — let me do it, some silence here — can you see this number? That's the number of records the database has. So on one side I'm running Fluent Bit as the log processor inside the node, and on the other I'm querying the database and I can see the logs are landing there. OK, let's try some visualization: localhost, here we go — Discover, logs — and we are going to create the index pattern with the timestamp field; we need to choose the right things, otherwise we don't get the logs. And here are the logs. For example, I'm going to choose this entry and look at the JSON that was ingested into the database. If you look carefully, here we have the stream, the timestamp, and then this new block of information — and that information comes from the API server. So once you have centralized all your logs in your database, you can query them: you can say "please show me all the logs where the pod name is apache" and you're going to get everything about apache. OK, so that was a quick overview.

So I know that most of you are using Fluentd, and Fluentd is more than just a standalone logging solution: Fluentd is really a fluent ecosystem. I don't know if you're aware of it, but Fluentd joined the CNCF; it's an inception project — incubation, sorry — and we are in the process of full graduation, maybe in a couple of months or before the next KubeCon. Fluentd also has a huge community: there are around 700 plugins available for storage — Kafka, Elasticsearch, MySQL, Redis and so on — plus different filters, and when we donated it there was also a strong commitment from the company to continue growing Fluentd in all aspects, including the community. For example, if you're writing your own applications in Python, you can use the Python SDK for Fluentd and ingest your log messages directly into Fluentd. But in order to make things better — sometimes people complain that Fluentd can use too much memory in some scenarios — we started creating a new, complementary solution called Fluent Bit, and Fluent Bit is like a C-language version of Fluentd, but with a strong focus on Kubernetes and Docker containers. Fluent Bit v0.13 was released just this week, and one of its features is annotated pods.

I don't know if you remember, when you looked at the Elasticsearch data — we don't have too much time, but let's try it very quickly — for example, the Apache log here: the log field is a raw string, but in your mind you know that an Apache log line has an IP, a timestamp, an HTTP method and a status. So one of the new features of Fluent Bit for these cases is annotated pods: when you deploy your pod you can say, OK, this is my pod, but also, Fluent Bit, please use the Apache parser for it — you can suggest that from an operational perspective. Let's do it pretty quickly: if you look at this pod there's a new annotation, one that was not there before; we are suggesting to the log processor that it can use this specific parser. It's the same container, but with a different annotation. OK, so that is running; we got more hits here, so let's go to Kibana and do a search: kubernetes.pod_name equals apache-logs-annotated. Here are the new logs for that specific pod, and now there's a difference — can you see this? That was not there before, because an Apache log line has no structure by itself, but by suggesting a parser to the log processor you get that structure right away inside your database. Before this you had to do a lot of magic to make that happen, and now Fluent Bit can do it for you — that's one of the biggest things. And another feature — I'm not going to demo it, because it works pretty much the same way — is that you can say "please exclude my logs": on the pod you set an annotation, fluentbit.io/exclude true, and the logs are dropped, because sometimes pods are too noisy or are running in debug mode and you don't need that.
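A sketch of the annotations being described, using the fluentbit.io/parser and fluentbit.io/exclude keys mentioned in the talk (pod name and image are illustrative, and the Fluent Bit configuration has to allow these suggestions, as noted below):

    apiVersion: v1
    kind: Pod
    metadata:
      name: apache-logs-annotated
      annotations:
        fluentbit.io/parser: apache      # suggest a parser for this pod's log lines
        # fluentbit.io/exclude: "true"   # or: ask the log processor to drop this pod's logs
    spec:
      containers:
      - name: apache
        image: edsiper/apache_logs       # fake Apache log generator; treat as illustrative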
We also added metrics in JSON and Prometheus formats, which means you can also query how your logging solution is working internally — I think this is one of the most anticipated features — and since we have a few minutes I'm going to run it locally on my computer. I'm not going to run Prometheus right now, just Fluent Bit. So I'm going to run Fluent Bit locally; I'm going to use an input — Fluent Bit can read many things: log files, TCP messages, but also CPU metrics — so, CPU metrics, and send those metrics to the standard output every second. Ah, but I didn't start the web server, so I cannot get to the metrics — let's go again. OK, it's reading the metrics, taking some CPU samples and pushing them to the standard output. Now we can use the HTTP interface to query the internals of that little web server — OK, in a pretty way, there you go. So you also get some insights: the version that is running, 0.13, the build flags, but, more important, the metrics. For the inputs you have one input, cpu, and we are processing 37 records, which amount to about eleven thousand bytes, and so on; if you refresh, you will see how this information grows. And what you specifically do is let Prometheus consume this endpoint — but, you know, this is JSON and Prometheus has a different format, so: /prometheus, there you go. OK — looks like you love Prometheus, right? Yeah, it's really cool.

So yeah, this is one of our additions. Also, people on the enterprise side say: we need more enterprise connectors. We solved data collection for Kubernetes, but now we want to ingest data everywhere, so we added support for Azure, for Kafka, and also for Splunk — maybe you love it, maybe you hate it, but we have the support because people want it. And annotated pods: you can use the parser annotation to suggest a parser — and I say "suggest" because in the Fluent Bit configuration you can reject those rules; maybe somebody wants to deploy a pod that asks to exclude its logs, but if that flag is not enabled or allowed in Fluent Bit, it will not happen — it will log it anyway. So: annotated pods, we just saw how to put an annotation; Fluent Bit with metrics, with Prometheus — that's what we showed you just now.

Well, I don't think we have time for the last demo, but let's run pretty quickly, in one minute, through the stats of the project. So far, in less than three years, we have had 74 releases and more than 32 contributors — that is low, but we can grow with you guys. Every day we see more than 50,000 — every day, since last month — which means more than three million over the last year. And what does 50,000 mean? Fifty thousand nodes created every day, somewhere, that need a logging solution and are choosing Fluent Bit — and the same metric applies to Fluentd; people use Fluentd and Fluent Bit in different scenarios. This is how it's growing. Somebody here was testing it — maybe he didn't like it. For the last version, 0.13, we were doing all these tests around the HTTP metrics, and we did like 20 versions since January, one per week, and people were testing, and people were so excited that some of them said "hey, I deployed this in production". And what happens in the end? Sometimes there are crashes from bugs, and they say "oh, but I was running this in production" — don't do it, it's a test version. But now it's really good, it's pretty stable; we don't release something unless it has unit tests, and we are moving forward with CI/CD and all those things. We just started development of the next version, 0.14, right after KubeCon, and our goals are to add load balancing for multiple outputs — maybe if you are sending to one Elasticsearch you can have many, with failover and round-robin support — nested multi-line parsing in Docker JSON logs, for example Java stack traces, and we need help with documentation, with testing, with everything that relates to an open source project, so you can reach us anytime. Thank you so much. [Applause]
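For reference, the local metrics demo shown above can be reproduced with something like the following (flags and endpoints as documented for the 0.13 series; verify against your Fluent Bit version):

    # CPU metrics in, stdout out, and the built-in monitoring HTTP server enabled
    fluent-bit -i cpu -o stdout -H -P 2020

    # In another terminal, query the internals over HTTP
    curl -s http://127.0.0.1:2020/                            # version and build flags
    curl -s http://127.0.0.1:2020/api/v1/metrics              # per-plugin metrics as JSON
    curl -s http://127.0.0.1:2020/api/v1/metrics/prometheus   # same metrics in Prometheus format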
Info
Channel: CNCF [Cloud Native Computing Foundation]
Views: 19,576
Rating: 4.8957653 out of 5
Id: 7qL5wkAaSh4
Length: 37min 0sec (2220 seconds)
Published: Fri May 04 2018