Logging, Monitoring, and Telemetry in DevOps Projects

Captions
Gentlemen, let's get started. Today our main topic is error handling, logging, monitoring, and telemetry. Try to describe what logging is. What do I mean when I say "let's add logging to an application"? What does that mean for you? No, I'm not talking about login, I'm talking about logging.

Yes: every operation, let's say, is logged to the console. Why do you say console? It's not wrong. Yeah, into a file. Now you said console, you said a file. Anything else? Did you ever hear, for instance, of the Windows event log? Would that be a suitable log target? Yeah, okay. What if you have a server farm? Think of Google: you don't have one server, you don't have 10 servers, you have 10,000 servers. Do you think it's a good idea to log to the console, or to a file on the local machine? So what do we need? A database, for instance. We could use a NoSQL database or a SQL database and log into the database.

What else? Let's stick with the database for a second. Do you think in 2021 it is necessary to build your own custom database to store logs? Probably not, because storing logs is really only half the way. Once you have stored the logs, what do you want to do next? You want to analyze them, right? Why do we do logging? In order to find out what's wrong. You have an error, somebody calls you: "Hey, your stuff doesn't work." You have to find out why, and how do you do that? You look into the log. So just storing it in a database doesn't help you very much. You need a nice UI with filtering possibilities and so on.

And logs are not a trivial thing. We could just log a text for each request, but maybe you want to run a query where you say: give me all the requests where the HTTP status code was 500, where the user was xyz, and where the client IP address was this and that. If you only have a text, that is really hard to find out. So what you want to have is structured logging, where you have different properties. Understand what I mean?

So what you could use are specialized servers, servers that are capable of receiving logs from a large number of servers. If you have a huge array of servers running in the cloud and you want to do logging for all of them, you take the logs of all those applications and feed them into a logging server that not only stores the data but also offers you strong capabilities for querying, for building dashboards, and things like that. Have you ever heard of a widely used log server? Does Microsoft have such an offering? Yes, Microsoft has an offering which fits perfectly with ASP.NET Core. It's called Application Insights. Application Insights is a logging system which is part of Azure Monitor. Azure Monitor is a broader set of applications and services for doing logging, monitoring, and telemetry, but the thing that we are going to look at today goes a little bit in the direction of Application Insights. I'm not going to teach you how to use Application Insights in this course, because it would be too difficult to set everything up and to make sure that everybody has access to it. We will use a smaller tool which we can install locally, just to practice. But in the real world, a log server like Application Insights would really be important.

Now you store all your logs in Application Insights. What do you use it for? What are typical use cases where it is super important to have all your logs in some kind of logging database? Yes, the server crashes and you want to find out why. Right, exactly. You can take a look, and what would you look for? The last event, because you think that maybe the last event, let's say it was an HTTP request, led to crashing your ASP.NET Core app. Okay, the server crashed, and that would be the reason why you take a look into the log. But how do you even recognize that your server crashed? A customer, maybe. Of course, that's obvious, but is that what we want, that the customer recognizes the error? No. Yes, a server monitor, exactly. What you can do is set up health monitoring, which is typically done by sending HTTP requests to your API, and once your API does not answer anymore, it sends an alert. This alert can be an email, it can be a push message, it can be an HTTP request triggering some automatic healing process on your side, whatever. So health monitoring and logging are tightly integrated. We are not going to take a look at health monitoring right now, but keep that in mind.

What are other situations where logs are super important? Yes, exactly: if you have an attacker, or you think there might be an attacker. Something suspicious happens on your server and you want to find out who is doing that, and whether it is legitimate or a malicious attacker. Then you have to take a look at the logs. What are you going to look for in the logs? Imagine you have that situation: you have the gut feeling that something is wrong, and you have a huge database of logs. How do you find out if something is wrong? You could look for specific login attempts, but what are you going to look for? Just all login events? No: a login event that was not done during work hours. Absolutely. You want to find out if somebody did something which does not reflect the usual usage behavior of typical customers, right? And that's the point, this is what I want you to take away here. That's a huge difference from "my server crashed", where you want to take a look at specific events to find out whether you have an unhandled exception or something like this. Looking for a potential attacker means analyzing a larger amount of logs and doing, for instance, a statistical analysis, or training a deep learning network so that it understands what the usual behavior is and detects if something is strange or unusual. This is what you said.

So machine learning is tightly coupled with logging, and systems like Application Insights do exactly that. Microsoft stores all the log records, and then they train machine learning models so that these models understand what is typical for your app, and if something seems strange, they will inform you. I regularly get emails from Application Insights telling us that maybe something strange happened: the load of this web API has risen by, I don't know, 25 percent, and that's not typical at that point in the day. Maybe this is something strange, and then you as the application owner have to find out whether this is a change in the behavior of your users or really an attacker.

So, crashing servers and attackers. What else do you do logging for? Yes, debugging. What do you want to debug? Okay, so a functional problem. A customer calls you, for instance, and says: "I expected the result 10 and your application said 12." What's the problem here? The term here, the third thing, is no-replay debugging. Typically, if a customer calls you and says "I expected 10 but I got 12", what do you do? You hit F5 on your machine, you enter the data, and you look at the result. And maybe on your machine it says 10, exactly what the customer expected, but for whatever reason your application in production says 12. Now you have a problem. You can call the customer and say: "I'm very sorry, I cannot reproduce this behavior. I have no idea why the application works like that." But that is not sufficient. What you want is to be able to analyze problems without replaying, and that only works if you have very detailed logs. Your application writes every single step to a log, a database, a file, whatever, and you can follow through step by step and understand what happened: why did the application say 12 and not 10?
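
To make the idea of structured, step-by-step logging concrete, here is a minimal sketch. It is not from the lecture, which targets ASP.NET Core; it uses Python's standard `logging` module purely for illustration. The names `JsonFormatter`, `props`, `calculate`, and `query`, as well as the request ids and the in-memory stream standing in for a log server, are all hypothetical.

```python
import json
import logging
from io import StringIO


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object so its properties stay queryable."""

    def format(self, record):
        entry = {"level": record.levelname, "message": record.getMessage()}
        # "props" is a hypothetical attribute name passed in via `extra=`.
        entry.update(getattr(record, "props", {}))
        return json.dumps(entry)


def build_logger(stream):
    """Create a logger that writes one JSON line per event to `stream`."""
    logger = logging.getLogger("calc-demo")
    logger.handlers.clear()
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def calculate(logger, request_id, user, a, b):
    """Toy operation that logs every step together with its input and output."""
    props = {"request_id": request_id, "user": user}
    logger.debug("input received",
                 extra={"props": {**props, "step": "input", "a": a, "b": b}})
    total = a + b
    logger.debug("sum computed",
                 extra={"props": {**props, "step": "sum", "total": total}})
    return total


def query(stream, **filters):
    """Return all logged records whose properties match every filter."""
    records = [json.loads(line) for line in stream.getvalue().splitlines()]
    return [r for r in records
            if all(r.get(key) == value for key, value in filters.items())]


# An in-memory stream stands in for a real log server in this sketch.
stream = StringIO()
logger = build_logger(stream)
calculate(logger, "req-1", "xyz", 4, 6)
calculate(logger, "req-2", "abc", 5, 7)

# Follow one request step by step without replaying it.
for record in query(stream, request_id="req-2"):
    print(record)
```

Because every record carries properties rather than only a text, a filter such as `query(stream, user="xyz", step="sum")` is a simple property match instead of a text search.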
So debugging without having to replay the scenario is a very important thing when it comes to logging. Exactly. What else? What about performance? A customer calls you and says your app is slow. You press F5 on your computer. You spent 5,000 euros on a great developer workstation, and it is blazingly fast. Then you tell the customer: "I'm very sorry, on my machine it works blazingly fast, everything is okay." That will not help your customer. So why is it important to have logging for performance analysis?

Ah, yes. In addition to regular logging, where you have certain events (HTTP request came in, HTTP response went out), what you want to have are, on the one hand, metrics: CPU utilization, memory utilization, network traffic, things like that. You could just install a small software probe, a software sensor, that regularly measures the CPU utilization and writes it as a metric to your log database. Then you can correlate the performance problem with your CPU utilization, and maybe you find out that the reason for the performance problem is that you simply have too many customers and the CPU is your bottleneck. Or you find out that memory is your bottleneck, or that disk I/O is your bottleneck. So metrics are another very important thing that goes beyond just application logging.

The other thing that you want to store in logging is timings. If you know that the customer pressed the button and it took a minute until the application answered, that doesn't help you very much, does it? You might ask yourself what exactly, which operation in your application, caused the delay. Was it the database query? Was it my own algorithm? Was it the web server doing something strange with my HTTP request? Was it some kind of queue where the request had been sitting, and it took forever until somebody picked it up? So you want to have some kind of X-ray built in, so you can take a look at what took how long. You want to have a time view where you see which step in the request took how long. That is another very important topic when it comes to logging.

This is why I always say logging, monitoring, and telemetry. With telemetry I mean metrics and timings and things like that. Monitoring means health checking and things like that. And logging we discussed. Okay. Do you think logging, monitoring, and telemetry are important in practice? Oh yes, they are.

Have you ever heard the term DevSecOps or, simply put, DevOps? What is DevOps? No idea? Oh, that is important. We have to talk about that, because it is very relevant nowadays when you do programming in the real world.

In the past, old-fashioned companies worked in silos. There is an administrative department, let's say an operations department, and then there is a software development department. You might work in the software development department, and as a software developer you are not responsible for running the application on the production servers. The administrators are responsible for running the applications on production servers. If you create a new version of your app, you have to take your version and ship it, send it to the administrators, and you also have to write a long documentation telling your administrators how to install your app. That is boring. The administrators will maybe not read your documentation, or your documentation is flawed, so things will break. What will happen? The administrator will angrily call you and tell you that your software sucks and doesn't work properly, and it will be a constant struggle. Administrators want to have no updates, no changes, no bugs, everything static. It should run. Never touch a running system: that's the mindset of a traditional administrator.

And what do we do as developers? We love new stuff, right? We love preview versions. We love to update NuGet packages although we have never tested the application properly. We love to build fancy stuff. So we are the opposite of a traditional operator. The traditional developer loves new things, loves to try things. Do you see the problem? We have two very different mindsets, and they are constantly crashing against each other. That has been the reason, in the last decades, why in so many companies you constantly have a struggle between these silos: people running the apps, people building the apps.

And when you think of SecOps, there is a third group: the people who have to secure the infrastructure. They would like to lock everything down. The internet is bad. Their dream is a server without a network card, because then nobody can break in. Do you think that makes sense? Of course not. So we have the security guys who want to lock everything down, you have the developers who want to use cloud services all over the place, and then you have the administrators who don't want to have any software, because if you don't have any software, nobody will call in the middle of the night. Problem, right?

And you know what DevSecOps is? It's pretty simple. Exactly: all those things combined. You build it, you run it, you secure it. Your problem. You form a team, a team that is responsible for an app, and this team has to build the code, has to run the code in production, has to get up in the middle of the night if a customer has a problem, and this team is also responsible for securing it. Of course, in larger companies there are specialists: specialists for security, for development, for operations. They can support and help the teams if they don't have the necessary knowledge to do all these things. But at the end of the day, the responsibility is combined in a single team.

So if you as the developer decide "hey, I don't care about logging, why should I do logging, I don't need that", now you are suddenly responsible for ops too. Now you will get a call in the middle of the night that your app crashed, for instance. So what do you need? Logging. You feel your own pain. You decided not to do logging, and now you need logging. Or somebody is stealing data from you, and then your boss comes to you and says: "You're responsible. You build it, you run it, you secure it. Tell me if we have a hacker." So you feel your own pain. You decided not to do logging, now you have a problem.

That's the very simple idea. In modern organizations you have a combination of development, security, and operations, and if you do DevSecOps, suddenly logging, monitoring, telemetry, and error handling become super important topics for you as developers, because you will feel the pain if you screw up in that area, not some anonymous administrator caring for your servers. Understood? You really have to have this mindset.

In a DevSecOps world, what we also want to achieve is that you, as DevOps specialists, really take responsibility for the apps. You are responsible for them. Your responsibility does not stop once you have written the line of code; it just starts there. You're responsible for compiling it, but it doesn't stop there. You're responsible for shipping it to all your web servers, but your responsibility doesn't stop there. You're responsible for every technical problem, for every security flaw, for updating everything. If a new version of the Linux operating system comes out and fixes a critical security bug, who is responsible for making sure that you receive that security patch? You are responsible. And with "you" I mean your team. Maybe not you personally, but your team has to take responsibility. Understand what I mean?

In practice you will find lots of companies out there who have not yet really understood and embraced the topic of DevSecOps. Especially among large organizations, you still have ones that struggle a lot because they are still thinking in silos. There are those security specialists who have their kingdom: "Security is my kingdom, I am responsible, and I let nobody in." Administrators love opening up boxes, unboxing servers, putting them into racks, and dealing with cables, because it's somewhat fun. Know what I mean? That's their kingdom, and they will let nobody in. And you as developers sometimes don't want to have the responsibility, because that means you have to get up at 2 a.m. You broke the main branch because you checked in flawed code, so you have to get up at 2 o'clock in the morning and fix it, because your colleagues want to ship the next day. That's the real world. This is how DevOps works at the end of the day. Understand what I mean? Questions?

Do you think it's fun to work in a DevOps team? Oh yes, it absolutely is, because you are responsible end to end. That's your baby. You can build it, you're responsible for it, you see it grow. You have access to the logs, you see the customers using it. You build a function, and the next day you can immediately take a look at how many customers already clicked on that button on your website or your web app. That's extremely motivating. If you work in a siloed team, you write a little bit of code, then you build everything, you send it to the admins, and you have no idea when this stuff is rolled out, if it works, if it crashed, if users liked it. You have no idea, and that's not motivating. But it needs a change of mindset. Taking responsibility is super important.

And now let's get back to logging, monitoring, and telemetry. Logging, monitoring, and telemetry are your friends, because we want to have security information by analyzing our logs, operations information by taking a look at specific events or analyzing our logs, and in the development area we want to have information about feature usage. I built a certain feature and I want to know if I wasted my time, because it's a waste of time to build a feature that nobody uses. And how do you find out if any of your customers uses that feature? Logging, monitoring, and telemetry, right?

So many teams, listen to that, that's important: so many teams really invest a lot of time and a lot of money in building super high-quality software. They are using the latest and greatest technologies, they are building unit tests and integration tests and automatic software deployment, and at the end of the day they ship something that nobody uses. That's a waste of time. Understand what I mean? That's really a waste of time. Don't waste your time. And how do you make sure that you don't waste your time? By finding out what people really need and what they really use. This is why Microsoft is constantly bothering you, asking: "Hey, can we please download telemetry data from your computer to see what you're doing with Windows?" That's why the Angular team, in the Angular CLI, asks you when you create a project: "Hey, are we please allowed to download telemetry data, so that we understand how you use the Angular CLI?" So they can spend their time on the right features and not on features that nobody is interested in. Understand what I mean?

That's the reason why so many companies go for software as a service, because if you run the software on your servers, you immediately have access to all the logs. If you ship software to your customers and they install it on their computers in their data center, do you ever get something back from them? Do you have any idea what they do with your software? Probably not, because typically you don't get the logs. Understand what I mean? So this kind of thinking is super important, and that brings us to the importance of logging, monitoring, and telemetry for the development area. You have to understand your customers, what they really want and what they really use. Good.

And what do you do with a feature in your software where you find out that nobody uses it? In a modern team, what do you do with that feature? Delete it. It's allowed. It's okay to remove features from a piece of software if they are not used. Unfortunately, most teams don't do that. They have spent weeks of development on this one report, to generate it for the customers, and nobody uses it. And what do they do? They keep it in the software. That means they have to maintain it, they have to update it. Understand what I mean? Why do you do that? Modern companies remove features. That's okay. Sometimes it's okay to tell your customer: "I'm sorry, only a few customers use those features. I had to remove them because we don't want to spend money on those features."

Okay, so now you know why we need logging, monitoring, and telemetry. Now we will walk over to the laptop, create a web application, add logging, monitoring, and telemetry to it, and see how that works with ASP.NET Core. Fair? Good, let's do that. Yeah, question?
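
The "X-ray" timing view mentioned earlier in the session can be illustrated with a short sketch. This is not the ASP.NET Core code from the hands-on part, just a Python approximation under stated assumptions: `RequestTimings`, the step names, and the `time.sleep` calls that simulate work are all hypothetical.

```python
import time
from contextlib import contextmanager


class RequestTimings:
    """Collect how long each named step of one request took."""

    def __init__(self, request_id, clock=time.perf_counter):
        self.request_id = request_id
        self._clock = clock  # injectable so the timings can be tested
        self.steps = []

    @contextmanager
    def step(self, name):
        """Time the enclosed block and record it as a telemetry entry."""
        start = self._clock()
        try:
            yield
        finally:
            self.steps.append({
                "request_id": self.request_id,
                "step": name,
                "seconds": self._clock() - start,
            })

    def slowest(self):
        """Return the recorded step with the longest duration."""
        return max(self.steps, key=lambda s: s["seconds"])


timings = RequestTimings("req-42")
with timings.step("database query"):
    time.sleep(0.05)  # simulated slow database call
with timings.step("own algorithm"):
    time.sleep(0.01)  # simulated computation

for entry in timings.steps:
    print(f'{entry["step"]}: {entry["seconds"]:.3f}s')
```

In a real system these entries would be sent to the log server together with the request's other properties, so that a slow request can be broken down into its steps after the fact.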
Info
Channel: Rainer Stropek
Views: 654
Keywords: Logging, Monitoring, Telemetry, DevOps, DevSecOps, HTL Leonding
Id: vtgJyROejFU
Length: 26min 6sec (1566 seconds)
Published: Thu Nov 11 2021