Introduction to Linkerd for beginners | a Service Mesh

Video Statistics and Information

Captions
You have an existing Kubernetes cluster, you have a microservice architecture, and you're interested in a service mesh. In this video we're going to take a look at a service mesh called Linkerd.

In our architecture we have a videos web page, exposed via an ingress controller on servicemesh.demo/home. When it loads, it makes a call to the playlists API, also exposed by an ingress controller, on servicemesh.demo/api/playlists. The playlists API retrieves the playlists from a Redis database, and for each video in a playlist it calls the videos API, a private service in the cluster with its own Redis database. After the playlists API has retrieved all the videos from the videos API, it sends the JSON data back to the videos web page, which renders it in the browser.

So in this video I'm going to show you how to add Linkerd to your existing Kubernetes cluster and then cherry-pick the features I believe might be valuable to you. Linkerd is subtle, discreet, and one of the least invasive service meshes out there, which means you can install it with ease, remove it just as easily, and opt in and out of features per microservice. I'm super excited and we have a lot to talk about, so without further ado, let's go.

If you're new to service mesh technology, I've made a video that is a gentle introduction to service meshes and service-to-service communication; the link is down below, so be sure to check it out. In my GitHub repo I have a kubernetes folder, inside that a servicemesh folder, and inside that a linkerd folder with a readme. That readme is the introduction to Linkerd, with all the steps and examples I'm going to show you today, so check out the link down below to the source code and follow along.

The first thing we're going to need is a Kubernetes cluster. For this demo I'm going to use a product called kind.
kind is used by the Kubernetes community to build and test Kubernetes in Docker containers. I'm going to say `kind create cluster`, call it linkerd, and run Kubernetes 1.19; this spins up a one-node Kubernetes 1.19 cluster in a Docker container. Now I can say `kubectl get nodes` and we can see a one-node cluster ready to go.

The user opens the browser on servicemesh.demo/home, which loads a bunch of playlists with videos in them. To walk through the architecture: when the user hits servicemesh.demo/home, the request comes to an NGINX ingress controller, which routes it to the videos web page, the page we're seeing in the browser with its list of playlists. When the page loads for the first time, the browser makes a GET request to servicemesh.demo/api/playlists, which goes through the ingress controller to the playlists API. The playlists API retrieves the playlist data from its playlist database, a Redis instance running inside the Kubernetes cluster. For every playlist it receives, it makes a private call to the videos API to fill out the video data and content: the title, the thumbnail, and so forth. Remember, the playlists API has no domain knowledge about video content; it's a microservice and only knows about playlists, so it has to ask the videos API for video content, and the videos API has its own database. Once all the video content has been sent back to the playlists API and the full playlist has been assembled, it's returned to the browser for rendering. This full architecture and every component is documented in my GitHub repo under kubernetes/servicemesh in an introduction readme file.
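The cluster bootstrap described above can be sketched as follows; the exact node image tag is an assumption, so pick a published v1.19.x image from the kind releases page:

```shell
# Create a single-node Kubernetes 1.19 cluster in Docker using kind
# (image tag is illustrative; use a v1.19.x tag published by the kind project)
kind create cluster --name linkerd --image kindest/node:v1.19.1

# Verify the cluster is up: expect one node in Ready state
kubectl get nodes
```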
That introduction covers every single component, how it all works, the traffic flow, and the full architecture, as well as how to run it with docker-compose (`docker-compose build`, `docker-compose up`, then access it over localhost) and how to run the whole thing on a Kubernetes cluster, which is what we're going to do today.

We've created a Kubernetes cluster; the next step is to deploy our microservice architecture. First we create a new namespace for the Kubernetes NGINX ingress controller, then `kubectl apply` the ingress controller manifests to deploy it into that namespace, and then apply the manifests for the applications: the playlists API, the playlists database, the videos API, the videos web page, and the videos database. I'll copy-paste all these kubectl commands into the terminal, and that applies all the YAML files. If you're interested in the YAML itself, head over to the kubernetes/servicemesh/applications folder, where each application is separated out. To make sure everything is running, `kubectl get pods` shows all the applications up, and `kubectl get pods` on the ingress namespace shows two NGINX ingress controller pods ready to go.

Because we don't have real DNS for this demo, we want to fake a DNS name. On Windows, open C:\Windows\System32\drivers\etc\hosts as administrator and add the entry; on Linux you can edit the /etc/hosts file the same way. To access the NGINX ingress controller, I'm going to open up a new terminal.
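Assuming the folder layout from my repo (the exact manifest paths are illustrative, so check the repo), the deployment steps look roughly like this:

```shell
# Ingress controller in its own namespace (path is an assumption)
kubectl create ns ingress-nginx
kubectl apply -n ingress-nginx -f kubernetes/servicemesh/ingress-nginx/

# Application manifests: playlists-api, playlists-db, videos-api,
# videos-db and videos-web
kubectl apply -f kubernetes/servicemesh/applications/

# Confirm everything is running
kubectl get pods
kubectl get pods -n ingress-nginx

# Fake DNS entry, added to /etc/hosts on Linux/macOS or
# C:\Windows\System32\drivers\etc\hosts on Windows:
#   127.0.0.1 servicemesh.demo
```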
We port-forward to the NGINX ingress controller on port 80, which lets us open servicemesh.demo in the browser. Browsing to servicemesh.demo/home, we can see all the playlists loaded with their video content. If you press F12 to open the developer tools and go to the Network tab, refreshing the page shows the playlists call being made: the browser loads the page and fires a request to servicemesh.demo/api/playlists, and that returns all the data we're seeing in the browser.

The next step is to download and install Linkerd. This is just personal preference, but for demo purposes I like to install everything in Docker containers; it isolates all the tools I need inside a container I can throw away when I'm done. So I run a tiny Alpine Linux container: `docker run -it`, mounting in my home directory (that's where my kubeconfig lives, for cluster access) and the current working directory (this entire Git repo) into a folder called /work, which I set as the working directory. I also run it with host networking so we can reach the host network, and I give it a shell. Now I'm inside the container and can get to work.

First I install curl, to download Linkerd, and nano, a lightweight text editor. We also need kubectl, so I use curl to pull down the latest version, give it execution rights, and move it to /usr/local/bin. Now kubectl is installed and ready to go.
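The disposable toolbox container and kubectl install can be sketched like this; the release URL was the standard kubectl download location at the time, but verify it against the current Kubernetes docs:

```shell
# Disposable Alpine "toolbox" with kubeconfig and the repo mounted in
docker run -it --rm --net host \
  -v ${HOME}:/root -v ${PWD}:/work -w /work alpine sh

# Inside the container:
apk add --no-cache curl nano

# Install the latest stable kubectl and make it executable
curl -LO "https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x kubectl && mv kubectl /usr/local/bin/
kubectl get nodes
```

Mounting `$HOME` to `/root` means the kubeconfig at `~/.kube/config` is visible to the root user inside the container.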
I can also set the default kubectl editor to nano. To test that everything works, `kubectl get nodes` shows we have access to our Kubernetes cluster, and `kubectl get pods` shows our full architecture running.

Next we need the Linkerd CLI. To download it, head over to the releases page on the GitHub repo and grab the binaries from there; we're going to use the edge-20.10.1 version. You can grab the binary from the page or pull it directly with curl from the release's download section. Once it's downloaded, I give it chmod execution rights, move it to /usr/local/bin, and then `linkerd --help` shows it's installed and good to go.

The Linkerd CLI has a great ability to check compatibility with your cluster. This ensures your cluster is suitable for the mesh, that you're running a compatible version of Linkerd for your version of Kubernetes, and it tells you anything that's incompatible. To run these pre-flight checks, I say `linkerd check --pre`, which checks a whole bunch of things in the cluster and gives a full report: whether it has Kubernetes API access and the right Kubernetes version, a set of pre-install checks, creation permissions, capabilities, and the Linkerd version. You can see it also warns us that we have some v1beta1 API extensions that will be deprecated after Kubernetes 1.22.

What I like about the Linkerd CLI is that every version of the CLI generates the manifests for that exact version of Linkerd. To install it, you just say `linkerd install`, and it spits out all the YAML manifests.
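The download and pre-flight steps above might look like the following; the exact release asset name varies by platform and release, so confirm it on the edge-20.10.1 release page:

```shell
# Download the Linkerd CLI binary from the GitHub releases page
# (asset name is an assumption; check the release's download section)
curl -L -o linkerd \
  https://github.com/linkerd/linkerd2/releases/download/edge-20.10.1/linkerd2-cli-edge-20.10.1-linux
chmod +x linkerd && mv linkerd /usr/local/bin/
linkerd --help

# Pre-flight compatibility checks against the cluster
linkerd check --pre
```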
To see the manifests that `linkerd install` outputs, I redirect the output into a file. I created a file on the left under kubernetes/servicemesh/linkerd in a manifest folder and wrote the YAML there: it has the namespace, the RBAC roles, all the deployments, permissions, config maps, and everything we need to get Linkerd up and running. To install it, I then `kubectl apply` that YAML file to my cluster, which installs all the Linkerd components inside the linkerd namespace. I can then say `watch kubectl -n linkerd get pods` and see all the components of the Linkerd control plane coming up; after a minute or so, all the pods in the linkerd namespace are up and running, so our control plane is good to go. As a final check, `linkerd check` runs a validation against that control plane to make sure everything is configured, everything exists, and it's all good to go.

One of the things I like most about Linkerd is the great user experience: some of the best UX and documentation I've ever seen. It's extremely easy to install and one of the least invasive and intrusive service meshes out there. All of the YAML I just showed you deploys into its own isolated namespace, meaning it doesn't tamper with your existing cluster or change a bunch of stuff. That makes it easy for us to cherry-pick features and turn on certain services to become part of the mesh, and if you decide you no longer need a mesh, you can simply uninstall it.
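The render-and-apply flow just described, as commands (the manifest path mirrors my repo layout and is illustrative):

```shell
# Render the manifests for this CLI version and keep a copy in the repo
linkerd install > kubernetes/servicemesh/linkerd/manifest/linkerd.yaml

# Apply them and watch the control plane come up
kubectl apply -f kubernetes/servicemesh/linkerd/manifest/linkerd.yaml
watch kubectl -n linkerd get pods

# Validate the installed control plane
linkerd check
```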
If we head over to the Linkerd architecture documentation, we can see Linkerd has a very simple, well-explained design built around a control plane and a data plane. All of the pods we've just deployed form the Linkerd control plane, and everything we add to the mesh becomes the data plane: our applications run with Linkerd proxies injected into their pods, and those proxies form the data plane. So the data plane is pretty much where all our apps run, and the control plane is the set of pods we've just deployed.

Going over the components: there's the linkerd-web pod, and both it and the Linkerd CLI talk to the public API served by the linkerd-controller pod. We'll take a look at the web dashboard in a second. Then there's the linkerd-tap pod, which provides the tap component, a feature that lets you tap into service-to-service communication; I'll show you more of that in a bit. The linkerd-destination pod is a lookup service that all the Linkerd proxies interact with; it tells them where to send requests. The linkerd-identity pod is the certificate authority, providing the certificates used for mutual TLS between all pods that are part of the mesh. The proxy-injector is the component responsible for injecting proxies into application pods. The linkerd-sp-validator is the service profile validator; in a bit we'll take a look at what service profiles are. Linkerd also deploys its own Prometheus and Grafana instances, which give us metrics for all the pods that are part of the mesh.
To access the Linkerd web dashboard, we run kubectl in the linkerd namespace and port-forward to the linkerd-web service on port 8084, then open the browser at localhost:8084, which takes us to the Linkerd dashboard. The first thing you'll notice is a quite nice overview of namespaces, the control plane, and all the workloads running in your cluster, with HTTP and TCP metrics across namespaces, cron jobs, daemon sets, deployments, jobs, pods, replica sets, replication controllers, and stateful sets. We can see all our namespaces, and drilling down into one shows its deployments, pods, and replica sets; going into the NGINX ingress namespace, for example, shows our ingress controller running there. We can also see tools like traffic split, tap, top, and routes, which we'll cover in more detail in a bit.

Now the first concept we need to talk about is meshing. If I select a namespace and drill down, none of our services are currently meshed. Meshing is what makes a specific pod or deployment part of the service mesh: by default none of your pods are affected, and you have to manually opt a service in. When a workload joins the mesh, Linkerd injects a proxy as a discreet sidecar into every pod of that workload; at that point the service is meshed and shows up in the dashboard. Looking at the namespaces section, nothing is meshed yet other than the linkerd namespace itself.

There are two ways to add a pod to the mesh. If we do `kubectl get deploy playlists-api`, kubectl returns the object, and with `-o yaml` kubectl outputs that deployment as a YAML file.
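The dashboard port-forward from a moment ago looks like this; linkerd-web is the dashboard service deployed by the control plane:

```shell
# Expose the Linkerd dashboard locally on port 8084
kubectl -n linkerd port-forward svc/linkerd-web 8084:8084
# then browse to http://localhost:8084
```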
As you can see, what Linkerd lets us do is take this YAML and inject the proxy configuration into it: we say `kubectl get deploy`, output as YAML, and pipe that to `linkerd inject`. kubectl takes the YAML out, passes it to the Linkerd CLI, and the CLI adds a small annotation to the deployment; you can see it under template.metadata.annotations: `linkerd.io/inject: enabled`. That's all you need to add a deployment, cron job, or daemon set to the service mesh. So to show you again: I say `kubectl get deploy`, output as YAML, pipe it to `linkerd inject`, and pipe the result to `kubectl apply -f -`. That pulls the YAML out with kubectl, sends it to the Linkerd command line, which injects the annotation, and applies the output back to the cluster, manually injecting the annotation into our deployment so it becomes part of the mesh. If I run that command, we can see it inject, and on the left the deployment is automatically rolling and joining the mesh; the playlists API now shows one out of one meshed, so it's part of the Linkerd service mesh.

The thing to note here is that we've just manually injected that annotation into the deployment. If you're using something like GitOps or a CI/CD pipeline, you probably want to put the annotation on your deployment in source control, otherwise your CI/CD will override what we've just done manually. The `linkerd inject` command is a great way to temporarily add deployments to the mesh, but to make it permanent you'd go to the Git repo where your deployment file lives. In my example, that's the kubernetes/servicemesh/applications folder: I can go to something like the playlists API deployment.yaml, go to template.metadata.annotations, and add the `linkerd.io/inject: enabled` annotation there.
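The inject pipeline just described, as commands (the deployment name matches my demo repo):

```shell
# Preview the injected YAML: linkerd adds the
# linkerd.io/inject: enabled annotation under template.metadata.annotations
kubectl get deploy playlists-api -o yaml | linkerd inject -

# Pipe the injected output straight back into the cluster
kubectl get deploy playlists-api -o yaml | linkerd inject - | kubectl apply -f -
```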
That means when I deploy through something like Jenkins, Flux, or Argo, the workload automatically stays part of the service mesh. So what I'm going to do is run `kubectl get deploy` for all my deployments, pipe all of them through `linkerd inject`, and apply them back, including the NGINX ingress controller. Pasting that into the terminal, all our services join the mesh; looking at the dashboard, all of my deployments are now meshed. It all works via the annotation, which also makes it easy to opt out of the mesh: simply remove the annotation from your deployment and apply it back to the cluster.

It's also important to note that when you change the annotation on a deployment, Kubernetes rolls out new pods for that deployment, so we have to close and reopen the port-forward to our NGINX ingress controller. Then let's open a new terminal and generate some traffic to our application. I'm going to run a PowerShell while loop that curls the servicemesh.demo/home endpoint and also simulates the browser's call to the playlists API, once every second. Copy-pasting that into the terminal starts generating web traffic to our application.

Now if we go to the Linkerd dashboard, we get a nice high-level overview of the namespaces in the cluster. We've meshed our default namespace, and we can now see HTTP metrics like success rates, requests per second, and 50th, 95th, and 99th percentile latencies. Going into the default namespace, we can see the playlists API making requests to the videos API.
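A plain shell equivalent of that traffic-generation loop (the video uses PowerShell; the endpoints are the ones from this demo):

```shell
# Hit the home page and the playlists API once per second, forever
while true; do
  curl -s http://servicemesh.demo/home/ > /dev/null
  curl -s http://servicemesh.demo/api/playlists > /dev/null
  sleep 1
done
```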
We can also drill down into each deployment: the playlists API, playlists database, videos API, videos database, and videos web are all meshed, the success rate is 100 percent across all of them, and we can see requests per second as well as latency metrics. Drilling into the playlists API, we can see it sending requests to the videos API, and we can see a fan-out: very few requests per second coming in, but about seven requests per second going to the videos API, because the playlists API makes a bunch of requests to fetch video content. Scrolling down, we can also see live calls: live traffic going to the videos API, the counters incrementing, the best, worst, and last latency, and the success rates. We can also see that all the incoming traffic is coming from the NGINX ingress controller: the browser makes a request to the ingress controller, which forwards it to the playlists API, and the playlists API then fetches all the video content from the videos API.

Going back, the cool thing you'll notice is that every single deployment has a link to a Grafana dashboard. Clicking it takes us to the Grafana instance running in the linkerd namespace, which gives us a bunch of black-box metrics for all the pods in the mesh, regardless of the programming language we've used. The cool thing about it is that I've added no code to any of my applications; I get all of this out of the box just by meshing my pods. We can see success rates for the playlists API, the number of requests per second, the inbound and outbound deployments (the outbound here is our videos API), the inbound traffic coming in from the ingress controller, and the outbound traffic going to our videos API. It also describes the outbound deployment in detail.
Going down, we can see the success rate to the videos API, the requests per second, and the latency. This is very impressive: the service mesh gives us direct links into Grafana dashboards with live telemetry for requests per second and latency percentiles, no matter the programming language we use, without changing a single line of code in any of our services. All we need to do is add an annotation to our deployments and have our pods join the mesh.

So Linkerd gives us great observability, latency, and requests-per-second metrics out of the box. But let's pretend the development team commits some buggy code to the videos API: how can Linkerd and all these metrics help us detect faults in our network? I've added some buggy code to the videos API that I can enable with an environment variable, so I say `kubectl edit` on the videos API deployment, go down to the environment variables, find the one called FLAKY, and set it to true. That rolls out a new videos API pod with the buggy code in it.

Going back to our namespaces, we notice the default namespace has a success rate around 91 or 92 percent and dropping. Drilling down, the playlists API has about a 42 percent success rate and the videos API roughly 87 percent. With this observability we now know there's a very low success rate between these two deployments, and we also know the playlists API makes requests to the videos API. So first let's jump into the playlists API and see what's going on. Both deployments have very low success rates, and when we scroll down to the live calls, we can see the live traffic going to the videos API and a clearly low success rate for requests going downstream.
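The fault was switched on with something like the following; `FLAKY` is the variable name used by the demo app (an assumption about its exact casing), and `kubectl set env` is a non-interactive alternative to the `kubectl edit` shown in the video:

```shell
# Flip the demo app's fault-injection flag on the videos API deployment,
# triggering a rollout of a new (buggy) pod
kubectl set env deploy/videos-api FLAKY=true

# Interactive equivalent: edit the env var by hand
# kubectl edit deploy videos-api
```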
We can also see that the calls coming in from the NGINX ingress controller have a very low, even zero, success rate. So there's a problem with the playlists API as well as the videos API: network problems between the ingress controller, the playlists API, and the videos API. Before we make assumptions about where the problem is, let's see if we can use Linkerd's tap functionality to tap into the network and figure out where the problem lies.

Tap is a feature that lets us listen to traffic streams for a given resource, so we can listen to network streams between pods in the service mesh. You can either use the command line and run `linkerd tap` on a deployment, or use the dashboard: go into the playlists API, look at the live calls, and there's a little icon that lets us tap the requests; or go to Tools, click Tap, find the resource (in our case the default namespace, starting with the playlists API), and click Start to tap it. It also shows the command to use in the terminal if you prefer that. Let's leave this running for a couple of seconds and then stop. What we see is very interesting: the requests coming from the ingress controller are all just spinning, which looks like they're timing out. Looking downstream, we know the playlists API is calling the videos API, and we can see it's getting 200 HTTP success codes but also, intermittently, 502s. If we look up what a 502 error means, it's that we've received an invalid response from the upstream server; in other words, we're getting invalid responses from the videos API. So it's becoming clear that the problem lies with the videos API and the way it responds. This is a pretty clear example of a cascading failure in a microservice architecture.
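The CLI form of the tap session just described; the `--to` filter narrows the stream to one downstream destination:

```shell
# Tap live request streams for the playlists API in the default namespace
linkerd tap deploy/playlists-api --namespace default

# Narrow the stream to traffic heading to the videos API;
# watch the status codes for intermittent 502s
linkerd tap deploy/playlists-api --namespace default --to deploy/videos-api
```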
The team responsible for developing the playlists API isn't handling these invalid responses coming from the downstream service, and the team responsible for the videos API has checked in some buggy code. Now that we've identified the problem, the videos API team will have to go and fix the bug. But we can see that Linkerd gives us a great overview of, and insight into, what's happening on our network.

Now, I briefly mentioned the concept of service profiles, so let's see how a service profile can help us tell Linkerd how to handle requests for a given service; that's basically what it is. There are a few ways to generate a service profile: from a Swagger spec, from protobuf, by auto-creation, or from a template. I'm going to show you the auto-creation and template mechanisms, which are the easiest way to generate one. To auto-generate a service profile, we say `linkerd profile` in the default namespace for the videos API service, tell it to tap the deployment, and give it a tap duration of 10 seconds. This runs the Linkerd tap on that deployment and generates a service profile template for us from the traffic it sees. We can see it created YAML output, which is a service profile, and during the time it tapped the service it recorded all the different GET requests it observed.

In the kubernetes/servicemesh/linkerd folder I've created a service-profiles folder with a service profile YAML for the videos API. I'm going to take all of the generated YAML, copy it, and paste it into that file, then remove the extra conditions for the individual requests, and I'm going to name the remaining route "GET all".
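The profile auto-generation step as a command; these flags (`--tap`, `--tap-duration`) are the ones the Linkerd CLI provides for tap-based profile generation:

```shell
# Watch live traffic to the videos-api service for 10 seconds and
# emit a ServiceProfile template built from the routes observed
linkerd profile -n default videos-api --tap deploy/videos-api --tap-duration 10s
```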
Instead of matching a single request path, I'm going to target all requests: the condition is a regex, so I can just say `.*` to match all GET requests coming to the videos API. The service profile lets us set up things like retries, timeouts, and a retry budget, so we can do automatic retries and a bunch of cool things with the traffic coming to this API.

We know the failures on the videos API are intermittent, so let's go ahead and configure automatic retries. Linkerd has great documentation on configuring retries using the service profile. We've already answered the question of which service we want to retry on, and we also want to know how many times to retry a request. Adding automatic retries is quite simple: we just add the `isRetryable` field to the route in our service profile spec, and that tells Linkerd to retry requests coming into the videos API. After creating the service profile, we apply the YAML with `kubectl apply`, which creates the ServiceProfile resource.

To see the effect of the service profile, we can use the `linkerd routes` command: run it in the default namespace on the playlists API deployment, with the videos API service as the destination, and it prints all the traffic going between the two. We can see there's now a 100 percent effective success rate: customers are seeing 100 percent success even though the actual success rate is only about 70 percent. That shows the proxies between the playlists and videos APIs are retrying the failed requests. The effective RPS is 7 requests per second while the actual is 11, so there are extra requests going out of the playlists API to the videos API for the automatic retries.
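The finished profile might look like this (the resource name follows Linkerd's convention of the service's fully qualified DNS name):

```yaml
apiVersion: linkerd.io/v1alpha2
kind: ServiceProfile
metadata:
  name: videos-api.default.svc.cluster.local
  namespace: default
spec:
  routes:
  - name: GET all           # the single catch-all route from the video
    condition:
      method: GET
      pathRegex: ".*"       # regex matching every request path
    isRetryable: true       # tell Linkerd these requests are safe to retry
```

Apply it with `kubectl apply -f` and inspect the effect with `linkerd routes -n default deploy/playlists-api --to svc/videos-api -o wide`, which shows effective versus actual success rates and RPS.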
If we hop back to the Linkerd dashboard for deployments, we can see the playlists API has fully recovered: it's retrying connections to the videos API, so customers are no longer impacted. I can go back to my browser and confirm this by refreshing the page; we have no downtime now. By configuring automatic retries, our playlists API is retrying its requests to the videos API, which has temporarily fixed the problem for us.

Now, it's important to understand the effects of automatic retries: too many retries can put a lot of extra strain on the network. Linkerd lets us configure a retry budget as part of the service profile, so we can define a fair balance between too many retries and just enough to absorb the error rate. We can confirm the extra network strain by running the `linkerd routes` command again: there are effectively 7 requests per second, but the actual rate is 11, so a lot of extra requests are leaving the playlists API and hitting the videos API. And although the latency percentiles may seem low there, the dashboard shows the playlists API latency percentiles are much higher than they used to be: they were roughly 10 milliseconds and are now almost 40. That's the overhead of adding too many retries. We can change our service profile YAML to include a retry budget, with a retry ratio, a minimum number of retries per second, and a TTL, then run the `linkerd routes` command again to monitor that overhead and tweak the service profile accordingly.

The other cool thing about adding our pods to a service mesh is that we get mutual TLS between all the microservices that are part of the mesh. That means the traffic between the playlists and videos APIs is automatically encrypted with mutual TLS, even though my codebase thinks it's talking plain HTTP over port 80.
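For reference, the retry-budget fields mentioned above sit alongside `routes` in the ServiceProfile spec; the values shown here are Linkerd's documented defaults, so tune them for your own traffic:

```yaml
spec:
  retryBudget:
    retryRatio: 0.2          # retries may add at most 20% extra load
    minRetriesPerSecond: 10  # floor so low-traffic services can still retry
    ttl: 10s                 # window over which the ratio is calculated
```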
So I don't need to add any code, and I don't need to rotate certificates myself. To confirm this, we can run the `linkerd edges` command on deployments in the default namespace and see that all of these connections are secured: the edge from the playlists API to the videos API is secured. We can also verify it using `linkerd tap` on the deployments in the default namespace, which looks almost like a tcpdump. If I run it for a couple of seconds and hit Ctrl-C, we can look at all the requests and see they all say tls=true; here's a clear example of a request going to our videos API from the playlists API with tls=true on it. That shows all the traffic between the playlists and videos APIs is encrypted with TLS.

I know that was a ton of information about getting started with Linkerd, but I hope you were able to absorb it all. Be sure to check out the link down below to the source code, follow along and try it out, and let me know down in the comments about your service mesh experience. In the next video we'll be taking a look at slightly more advanced topics such as canary deployments and tracing, so also remember to let me know in the comments what sort of features and topics you'd like to see in the future. Remember to like and subscribe, hit the bell, and check out the link to the community server below. If you'd like to support the channel further, hit the join button and become a member. As always, thanks for watching, and until next time: peace.
Info
Channel: That DevOps Guy
Views: 11,662
Rating: 4.9633865 out of 5
Keywords: devops, infrastructure, as, code, azure, aks, kubernetes, k8s, cloud, training, course, cloudnative, az, github, development, deployment, containers, docker, rabbitmq, messagequeues, messagebroker, messge, broker, queues, servicebus, aws, amazon, web, services, google, gcp, servicemesh, linkerd, istio, mesh, envoy, proxy, sidecar, networking, networks, https, tls
Id: Hc-XFPHDDk4
Length: 32min 43sec (1963 seconds)
Published: Sun Oct 18 2020