Context Propagation makes OpenTelemetry awesome

Captions
Hi. When people come to OpenTelemetry for the first time, they're often confused by a concept called context propagation. If you've never used distributed tracing before, context propagation might be new to you. So, to help with some learning, I'm putting together a series of videos that explains this concept. We're going to go over an overview of context propagation and why it's useful, context propagation, and finally what makes context propagation an amazing platform for building any kind of cross-cutting concern. Let's get started.

Okay, so let's dive in. What is context propagation? To understand that, it's best to first understand the problem that it's trying to solve, which is basically distributed transactions. What's a distributed transaction? Well, imagine you've got a client trying to upload an image and a caption to a service. The most basic version of this wouldn't be just one service. Actually, it would be something like a reverse proxy, talking to an auth service, talking to scratch disk, talking to an application service, which is talking to cloud storage and also talking to a data service, that's also talking to a SQL database and a caching system. So even the most basic, LAMP-stacky version of this problem is eight services. That's a lot of services.

Basically, what I'm saying is every system is a distributed system when you're talking about the web, and in modern times this has only gotten worse. Systems are becoming larger, they're more complicated, they're more heterogeneous. There isn't just a single monolith anymore where you can centralize discovery of all this information, and even if you had a single monolith, it would still be horizontally scaled. All of that makes reconstructing the chain of events involved in a particular transaction really difficult.

What you need is some kind of context that you can attach to all of your logs, metrics, and any other data you're emitting out of your system, so that you can observe it. If you don't have context, you can't identify which events are part of which transactions. Like, seriously, how do you do this right now? When you're looking at your logs and you want just the logs for a single transaction, when that transaction has spanned, say, five services, and each one of those services is running on like 50 servers, how do you find just the logs for that one transaction? Like, seriously, what do you do? It's annoying.

And the answer is a trace ID. If you have a trace ID, then you have an ID that's attached to every single event in the transaction, and that is critical. If you have a trace ID, then if you find one log, you can look up the trace ID on that log and find all the other logs in that transaction. Likewise, if you're looking at, say, a metrics dashboard, and you're looking at these events in aggregate, and you see a spike in the dashboard, you want to know what transactions were causing that spike. Let's say you're looking at 500 errors. Well, which 500 errors? You want to see what events led to that metric being emitted. Right now, generally speaking, that's sort of divided up across a bunch of different tools. You go look at your metrics dashboard, and then you make a guess, and then you go into your logging tool and you kind of hunt around. So what you need is a way to actually bring all of this data together, with indices that are attached to observations across all of these different types of signals. That is what you get out of OpenTelemetry, and the way OpenTelemetry gives you that information is through context propagation.

So what is context propagation and how does it work? It's actually pretty simple, conceptually speaking, though I'll admit kind of annoying to build. Let's imagine you have two services, green and blue, and there's a set of operations that occur in green that lead to a network call, triggering a set of operations in blue. Well, how do you follow this transaction and attach a trace ID to it? There are two parts.

One: within a process, you have something called a context object. This is basically just a dictionary, or bag of keys and values, that follows along the path of execution. So as you go from function to function and library to library, this bag follows along in the background, and you can access it at any time, put stuff into it, and pull stuff out of it. So that's where you store your trace IDs and things of that nature within your process: in a context object.

The next bit is this network call. How does that work? To send a context object over a network call, you have to serialize it and turn it into a set of headers. We call that propagation. Basically, on the client side you inject the context, by taking it and turning it into strings that are keys and values, represented as, say, headers on your HTTP call. Then on the server side you extract those keys and values from the headers and turn them back into a context object, which can then follow along the path of execution. And you can continue this on down the stack. If blue talks to yellow, and so on and so forth, you just keep serializing that context, injecting it into the headers, and extracting it on the other side. This allows you to take something like a trace ID, send it along, and attach it to every single log or metric that you happen to emit along the way.

So, in short: context objects follow the path of code execution within your service. Propagation attaches context to network calls and sends it from service to service. That's all you need to know.

But there is one more little bit that involves configuration: what headers are you using for propagation? Unfortunately, there are a bunch of options, which honestly really sucks. There are the new W3C Trace Context headers. These are the official headers, now a W3C standard, so that's great. But there are also the Zipkin B3 headers, which are sort of a de facto standard that's been around for a while. There's also a whole slew of custom and proprietary headers. I'll note one, the AWS X-Ray headers, the X-Amzn header. That's one, but there's a bunch of proprietary things. Lightstep used to have its own headers, and so on and so forth. This is a bummer, because systems will break if the client serializes one set of headers and the server is expecting a different set of headers.

So what do you do there? Well, check what your services are actually using. If they're using OpenTelemetry, by default that will be Trace Context. So if you're starting from scratch, use Trace Context. This is the new standard, so just use that. It's got some advantages over the other ones; we won't get into that right now. If you already have some kind of tracing deployed, and that's using B3 headers, then you should use B3 headers. That'll ensure that your new services, when they're deployed, can talk to the existing services. There's nothing wrong with B3, so it's fine to use it.

There's only one final gotcha, which is AWS Lambda. I just want to call this out: if you happen to be running serverless on Lambda, that currently only supports the Amazon X-Ray headers. The Lambda clients automatically use those X-Ray headers, but if you're deploying services on Lambda, just know that you have to use those headers. That's just a gotcha that people can fall into. But basically, use the Trace Context headers, and that is that.

Okay, so we've covered context propagation basics: what it is and why it's useful. In the next set of videos, we'll be diving into the individual APIs, how to use them, and how to build interesting stuff on top of them. If you're interested in that, consider subscribing so you'll get a notification when those videos come out. Also, if you liked this and found it helpful, please like the video or leave a comment. That's the only way I know that you found it interesting. I hope you enjoyed it, and I'll see you next time.
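
The two parts described in the captions (a context object that follows execution inside a process, and inject/extract across a network call) can be sketched with the Python standard library alone. This is a minimal illustration, not OpenTelemetry's actual API: `set_value`, `get_value`, `inject`, `extract`, `green_service`, and `blue_service` are hypothetical names, and the "network call" is simulated by running the blue handler in a fresh, empty context. Only the `traceparent` layout (version, trace ID, parent ID, flags) is taken from the W3C Trace Context format.

```python
import contextvars
import secrets

# Hypothetical in-process "context object": a bag of keys and values that
# follows the path of execution. Treated as immutable (copy-on-write).
_CONTEXT = contextvars.ContextVar("request_context", default={})


def set_value(key, value):
    """Put a key/value pair into the current context."""
    _CONTEXT.set({**_CONTEXT.get(), key: value})


def get_value(key, default=None):
    """Pull a value out of the current context."""
    return _CONTEXT.get().get(key, default)


def inject(headers):
    """Client side: serialize the context into string headers."""
    trace_id, span_id = get_value("trace_id"), get_value("span_id")
    if trace_id and span_id:
        # traceparent layout: version-traceid-parentid-flags
        headers["traceparent"] = f"00-{trace_id}-{span_id}-01"
    return headers


def extract(headers):
    """Server side: turn the headers back into context values."""
    parts = headers.get("traceparent", "").split("-")
    if len(parts) != 4 or len(parts[1]) != 32 or len(parts[2]) != 16:
        return {}  # missing or malformed header: no context to restore
    return {"trace_id": parts[1], "span_id": parts[2]}


def blue_service(headers):
    """Simulated remote service: starts with an empty context."""
    def handler():
        for key, value in extract(headers).items():
            set_value(key, value)
        return get_value("trace_id")
    return contextvars.Context().run(handler)


def green_service():
    """Starts a transaction, then 'calls' blue with injected headers."""
    set_value("trace_id", secrets.token_hex(16))  # 32 hex characters
    set_value("span_id", secrets.token_hex(8))    # 16 hex characters
    headers = inject({})
    return get_value("trace_id"), blue_service(headers)
```

Calling `green_service()` returns the trace ID seen in green and the one restored in blue; they match because the only thing that crossed the simulated network boundary was the `traceparent` header.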
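
The captions' point about finding "just the logs for that one transaction" can be illustrated with Python's standard `logging` module. This is a sketch under assumptions, not how OpenTelemetry integrates with logging: `current_trace_id`, `TraceIdFilter`, `ListHandler`, `build_logger`, and `logs_for_trace` are hypothetical names, the trace IDs are made-up short strings, and an in-memory list stands in for a real log store.

```python
import contextvars
import logging

# Hypothetical holder for the trace ID of the transaction being handled.
current_trace_id = contextvars.ContextVar("current_trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Stamps every log record with the current trace ID."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True  # never drop the record, only annotate it


class ListHandler(logging.Handler):
    """Collects records in memory, standing in for a log backend."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def build_logger(name, handler):
    """Create a per-service logger that annotates and stores records."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addFilter(TraceIdFilter())
    logger.addHandler(handler)
    return logger


def logs_for_trace(records, trace_id):
    """Look up every log message that belongs to one transaction."""
    return [r.getMessage() for r in records if r.trace_id == trace_id]


store = ListHandler()
auth_log = build_logger("auth", store)
app_log = build_logger("app", store)

# Two transactions pass through both services.
for trace_id in ("aaa111", "bbb222"):
    current_trace_id.set(trace_id)
    auth_log.info("token checked")
    app_log.info("image uploaded")
```

With the trace ID attached to each record, `logs_for_trace(store.records, "aaa111")` returns only the two messages from the first transaction, regardless of which service emitted them.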
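
The warning that "systems will break if the client serializes one set of headers and the server is expecting a different set" can be made concrete with a small sketch. The header names below (`traceparent`, `X-B3-TraceId`, `X-Amzn-Trace-Id`) are the real ones for W3C Trace Context, Zipkin B3 multi-header, and AWS X-Ray; everything else (`from_trace_context`, `from_b3_multi`, `from_xray`, `extract_trace_id`, and the sample ID values) is a hypothetical illustration, not OpenTelemetry's propagator API.

```python
def from_trace_context(headers):
    """W3C Trace Context: traceparent = version-traceid-parentid-flags."""
    parts = headers.get("traceparent", "").split("-")
    return parts[1] if len(parts) == 4 else None


def from_b3_multi(headers):
    """Zipkin B3 multi-header format: the trace ID has its own header."""
    return headers.get("x-b3-traceid")


def from_xray(headers):
    """AWS X-Ray: semicolon-separated fields such as Root=...;Parent=..."""
    for field in headers.get("x-amzn-trace-id", "").split(";"):
        key, _, value = field.partition("=")
        if key.strip() == "Root":
            return value
    return None


def extract_trace_id(headers, extractors):
    """Try each configured format in order; None means the trace is lost."""
    lowered = {name.lower(): value for name, value in headers.items()}
    for extractor in extractors:
        trace_id = extractor(lowered)
        if trace_id:
            return trace_id
    return None


# A client that injected B3 headers (sample values for illustration).
b3_request = {
    "X-B3-TraceId": "463ac35c9f6413ad48485a3953bb6124",
    "X-B3-SpanId": "a2fb4a1d1a96d312",
}

# A server configured only for Trace Context cannot see the trace ID.
lost = extract_trace_id(b3_request, [from_trace_context])

# A server that also accepts B3 picks the transaction back up.
found = extract_trace_id(b3_request, [from_trace_context, from_b3_multi])
```

Here `lost` is `None` and `found` is the B3 trace ID, which is the mismatch the captions describe: the fix is to configure both sides with at least one header format in common.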
Info
Channel: Lightstep
Views: 1,403
Keywords: apm, microservice, performance monitoring, distributed tracing, distributed systems, opentelemetry, metrics
Id: gviWKCXwyvY
Length: 9min 40sec (580 seconds)
Published: Fri Apr 30 2021