Autoscale applications on Kubernetes with Kubernetes Event-Driven Autoscaling (KEDA) | Azure Friday

Captions
>> Hey friends, are you struggling to autoscale your applications on Kubernetes, or do you want to run your Azure Functions on it? As we all learned, choosing your compute infrastructure defines how you scale your applications. Tom, an Azure MVP, is here to show me how Kubernetes Event-Driven Autoscaling, or KEDA, makes application autoscaling so easy, even I can do it. I'm going to learn how today on Azure Friday. [MUSIC] >> Hey friends, I'm Scott Hanselman and it's Azure Friday. I'm here with Tom Kerkhove and we're going to talk about something exciting, which is autoscaling. I like anything where I don't have to do the work. How are you, sir? >> I'm very well. Thank you. How are you? >> I'm very well. You're going to show me how to take something like Azure Functions and host it, not the way that it's done as software as a service or functions as a service, but hosted rather in Kubernetes, and then autoscale it. There are a lot of interesting Lego pieces you're going to snap together here, and you're going to introduce a new concept, a new project to us called KEDA, that makes it all possible. >> That's correct. >> What are we looking at here on our slide? What are the autoscaling responsibilities here? >> Yes, so Azure has a lot of offerings for how you can host your applications, but the moment you decide how to run them, you implicitly also decide how you will scale them. For example, Azure Functions is serverless technology; they will handle all the scaling for you. That's super simple and easy, you don't have to worry about anything. But the problem is sometimes you want to have control over the scaling, so that's not an option, or that's less ideal for you. Then the other option is something like Azure App Service, which is more of a PaaS service, where you run your application and can autoscale it with something like Azure Monitor Autoscale, where you define how it should scale and still use a PaaS offering. But if you want even more control, you can go to something like Kubernetes. But if you run Kubernetes on Azure Kubernetes Service, it's more of a cluster platform as a service, meaning you still need to scale your cluster and your application, and you have to understand how all of that works. With KEDA, we want to make it a lot easier for you and basically give you the simplicity of scaling as it would be on Azure Functions, but bring it to Kubernetes. >> Okay, so if I understand correctly, Azure Functions is fully managed and it scales based on magic dust that they have; it's their own thing. >> Yeah, that's correct. They have their own scale controller. >> Okay, they have their own scale controller. Then there's Azure App Service, which I'm very, very familiar with: I can host Azure Functions in a container with Docker, I can host it on my own machine, I can host it in App Service, and then it's on me to decide how to scale. I can use Azure Monitor or different things like that. But now if I put that container, that self-hosted Functions container, into AKS or any Kubernetes anywhere, it really is on me to start thinking about how to scale, and that's where KEDA comes in. That's not an Azure thing; that's an open-source Kubernetes event-driven autoscaling component. >> Yeah, that's correct. The Azure Functions team partnered with Red Hat in 2019 because they wanted to bring the same capabilities to Kubernetes, because one of the strategies of Azure lately has really been to bring Azure serverless anywhere. As of today, you can already run your Functions anywhere.
Now we have a preview for Logic Apps, and you can bring API Management wherever you want. You can run it on other clouds for a multi-cloud strategy, you can run it on the edge, on boats, you can run it on-prem. You can basically host it wherever you want. Now if you do that, that also means that you are in control of that scaling aspect, and because of that, the Functions team wanted to make it easier for customers to do that. That's the goal of KEDA: make application autoscaling that simple. Just define how it should scale, and KEDA will manage all the rest. >> Very cool. That sounds like something that everyone would like. >> I hope so. The thing is, if you're using Kubernetes without KEDA, it has some scaling capabilities out of the box. For example, you can use the Horizontal Pod Autoscaler to scale your deployments. But the problem is that it only supports CPU and memory. If you have a queue worker, for example, you need to get external metrics from Azure Monitor, bring them inside the cluster, and scale based on them. So you will need to host your own autoscaling infrastructure and use adapters to get the metrics from external systems and scale based on those. Now, the problem is you can only have one of those adapters. So if you're using Azure and maybe Prometheus with Kafka, this will not be possible. There are workarounds: you can use Promitor to bring Azure Monitor metrics into Prometheus and then scale fully based on Prometheus, but it's a bit cumbersome. With KEDA, we want to make this super simple. Instead of standing up all of that infrastructure, you just do a simple install of KEDA with Helm, and we have a full list of scalers where we know how to get the metrics and we know how to authenticate; you just need to give us the thresholds. Then what you can do is deploy a scaled object, which basically defines what you want to scale and based on what, and we will manage all the rest by using the scale controller, which uses the HPA under the hood, and we can do zero-to-N scaling. If there is no work, we will remove all the instances and optimize the cost. That's the whole promise of KEDA, basically. >> Forgive my ignorance, but Kubernetes is a monolithic thing that's also composed of many different microservices, so there are always different plug-ins and such. Is KEDA the kind of thing you would want built into Kubernetes, or is it actually part of the strength of Kubernetes that you can plug something so fundamental in and also plug it into other stuff? >> Yeah, I think it's more of a plug-and-play model where everybody who needs this can just install it. I don't think it should be in the core, because it depends on a lot of external systems, and that's why it's a nice add-on. Of course, it would be nice if this could be integrated with AKS in the future, for example. So if you create your cluster, you just opt in and say, "Hey AKS, install KEDA for me already so I can get going immediately." >> Interesting. Okay. >> Let's have a look at the demo. In this very basic demo, I have two apps. I have an orders app, which is a .NET Core worker processing an orders queue. Then I have an Azure Function, which I'm deploying in my cluster.
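To make the Horizontal Pod Autoscaler limitation Tom describes concrete: without KEDA, built-in autoscaling looks roughly like the sketch below, which can only react to CPU or memory and can never scale to zero. This is a minimal sketch; the "orders" deployment name and the thresholds are illustrative, not taken from the demo repo.

```yaml
# Minimal sketch of Kubernetes' built-in autoscaling (no KEDA).
# "orders" is a hypothetical deployment name.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: orders-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: orders
  minReplicas: 1        # the built-in HPA cannot go below one replica
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu       # only CPU and memory are supported out of the box;
        target:         # queue depth needs an external metrics adapter
          type: Utilization
          averageUtilization: 70
```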
Because we want to optimize for cost and for compliance, we need to run everything on Kubernetes, because it is built on open standards and the customer does not want vendor lock-in, let's say. Why is this important? If they want to run it on-prem or in another cloud, it's fully portable, and that's the beauty of Azure Functions as well: you can run it anywhere. I will go to my demo. On the left, you can see my orders portal, where you see the queue depth for the orders and the shipments, and then I have a bunch of CLIs here. Now before we go there, I will quickly go through the demo itself. I am going to close this; this demo is fully available on GitHub if you want to run it yourself. The order component is just a regular .NET Core worker, where we have an order processor that connects to the Service Bus queue, very basic. Then I just have one Azure Function in a separate app, which connects to the shipments queue, and those who are already using Azure Functions will see that this is just the same thing. I can go to Azure, deploy this function, and it will work perfectly fine; they will scale everything for me. Now in this case, we want to run it on Kubernetes. So what I did was add a Dockerfile to build each application, and I already pushed the images to the GitHub Container Registry, and I will deploy them. For every application, I have the deployment, and it's nothing fancy. You'll just see this is the image, this is my orders queue name and the shipments queue name to which I will send the shipment requests, and then I have the connection string here as a secret. The secrets are created below, but the point here is that we are using the least-privilege principle: my orders app will have send and receive permissions, and the shipments app will only have receive permissions. I'll come back to why that is important later. I will deploy that now to the cluster; I'm just applying this. >> Is this an AKS cluster or is this a local one? >> This is just an AKS cluster. It can be a [inaudible] cluster, it can be a cluster on GCP, for example. It works fine on all of them. As you can see, it deployed two deployments, and you'll see that we are running one instance of each. What I will do is start a watch here, so if something changes, we'll be able to see what's going on. Now, before we can autoscale it, we need to install KEDA. You can very easily do this with Helm. I will do a helm install of keda from the kedacore chart, and I will install it in the keda-system namespace. The chart is available in our Helm repository, which you can easily add. Then you can see that it just installs KEDA, and we're good to go. What you can do now is say, "Hey, KEDA, show me all the scaled objects," and you will see that there are none. What is a scaled object? A scaled object defines how we will scale our applications. That's what we will do now; I will show you what that looks like. Here, I have a separate deployment file for Kubernetes where we have a secret with a different connection string that has management permissions. Why is this important? KEDA requires management permissions because the Service Bus [inaudible] requires that; if you want to get the metrics, you need Manage permissions. That's also one of the beauties: you can assign different credentials to KEDA than to your deployments, to keep that least-privilege principle. Now, how will we scale?
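For readers following along, the install-and-inspect steps Tom narrates map onto commands like the following. This is a hedged sketch based on the KEDA Helm chart documentation; the release and namespace names follow the narration and may differ slightly from the demo repo.

```shell
# Add the KEDA Helm repository and install KEDA into its own namespace
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda-system --create-namespace

# "Hey, KEDA, show me all the scaled objects" (none exist yet)
kubectl get scaledobjects
```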
We do that with the scaled object, where we use scaleTargetRef, which is the name of the resource that we want to scale. By default, it targets deployments, but you can basically scale anything in Kubernetes. It can be a StatefulSet, it can even be a custom resource from another project, like Argo CD, for example. In this case, we will just scale our deployment. We're saying we want to have a maximum of 10 replicas, and we are using the default minimum, which means we will scale all the way back to zero if there is no work. Now, we just have to define how it should be triggered. Here, we're saying we want to scale based on Service Bus and we will check the orders queue; if there are more than five messages, add another instance. Then the last thing we do is refer to a trigger authentication. That is a separate KEDA resource where we define how to authenticate. In this case, we use a Kubernetes secret. You could also use Azure managed identity, you can refer to environment variables, and we have other options, like HashiCorp Vault, from which you can pull these secrets. The beauty is that you can reuse these trigger authentications across different scaled objects. I will go ahead and deploy this. Before I do that, let's add some work here. I will do a dotnet run, and I have a generator here which will just send a lot of orders to the queue, basically to trigger some processing. In this case, we'll send 1,000 messages, and you see we're generating all these nice orders with Bogus, a nice library to generate fake data, and you see that the orders queue is piling up. Eventually, the orders deployment will be able to cope with them, but because we only have one instance, this will take a lot of time. I will now deploy the autoscaling. You see that it creates a secret, the scaled object, and the trigger authentication. Then when KEDA kicks in, you will see that it starts scaling out. And there it comes in: you can see it's already scaling out to four instances. It's adding them and starting them up. If they can't keep up, it will also add more and more and more until it hits 10, and then when all the work is done, it will scale back. You see the orders go literally through the roof, so the chart can barely cope with them, which is actually pretty funny. But you see now we're at eight. You can fully control how aggressive the scaling is; you can also say, I want to wait five minutes before adding another instance. Now we're at 10, and then you'll see that the chart will start going down. While we wait for that, one thing I forgot: you can also, in the CLI, get more information about those scaled objects. Here, we can see that this scaled object is going to scale a deployment, and the target name is the name of our deployment. You can see the trigger, how it authenticates, and whether it is scaling. As an operator, you can also use this to get an understanding of what is going on. Now, you see that the shipments queue is not really piling up a lot, but we want to reach zero here. So we're going to do a kubectl apply here and add autoscaling for the shipments as well. Again, before I do that, let's have a quick look at what it looks like. Here, it's a lot simpler. We just have one scaled object, and we can reuse the same authentication that we already defined, so we're reusing that credential.
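Below is a hedged sketch of the orders scaled object and trigger authentication Tom walks through, using KEDA's documented keda.sh/v1alpha1 API. The deployment, secret, and authentication names are assumptions based on the narration, not copies from the demo repo.

```yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: servicebus-management-auth   # hypothetical name
spec:
  secretTargetRef:
    - parameter: connection          # the Service Bus scaler's connection parameter
      name: servicebus-management    # secret holding the Manage-permission connection string
      key: connection-string
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: orders-scaler
spec:
  scaleTargetRef:
    name: orders                     # targets a Deployment by default
  maxReplicaCount: 10                # minReplicaCount defaults to 0, so it scales to zero
  triggers:
    - type: azure-servicebus
      metadata:
        queueName: orders
        messageCount: "5"            # add roughly one instance per 5 queued messages
      authenticationRef:
        name: servicebus-management-auth
```

The shipments scaled object Tom applies next would look the same minus the secret: queueName becomes shipments, and the authenticationRef points at the same TriggerAuthentication, which is the credential reuse he mentions.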
For the rest, it's completely the same. We just define that we want to use Service Bus, we're going to scale on the shipments queue, and for every five messages, we're going to add an instance. I will apply this. You'll see that the queue is starting to shrink. Now, with the scaled object for the shipments, we will also see an additional deployment scale out when it kicks in. It's starting to add up to three instances, and eventually it will add more and more and more of them. Now, if you paid attention, Azure Functions is actually still pretty fast on Kubernetes as well, and it can fairly well keep up with all the shipments load; you see how fast things are going down now. If we waited until everything is finished, you would see that all the instances here slowly go back, removing instances all the way down to zero, and that's how easy the autoscaling is. You can autoscale any application; you just use the scaled object to point to what you want to scale and how you want to scale it, and then KEDA takes care of all the rest. >> Wow. It seems like this is an example of what they call a cloud-native app. This is a real classic example of an app that is resilient and uses the resources that it needs to get the job done. It backs off, it scales out as needed. Watching that queue come down is really the proof of that. >> Yeah. KEDA here actually serves a bigger purpose as well as making autoscaling on Kubernetes a lot simpler. You can use KEDA for the application autoscaling, like we've seen in this case, which will scale those services across all the nodes. But Kubernetes is cluster technology, so we have to ensure that we have enough nodes. That's where the Kubernetes Cluster Autoscaler kicks in, which is also supported by AKS. It's your responsibility to set this up, so don't underestimate it. But at a certain point in time, the Cluster Autoscaler will not be able to cope with the load, and you'll need to mitigate, or overflow, let's say. That's where virtual nodes kick in. That's another AKS feature which allows you to overflow your containers to ACI, while from a Kubernetes perspective, or for the operators, they are still running on Kubernetes. It's the same look and feel, but you just mitigate the resource starvation, let's say. Now, the whole selling point of KEDA as well is that if the whole service scales all the way back to zero, we free up a lot of resources, and that allows us to scale down the whole cluster again. By using these three components together, you really have the autoscaling sweet spot for Kubernetes, thanks to the serverless features of Functions brought to Kubernetes. >> Very cool. It seems like it's on us to make sure that the word gets out that things like KEDA exist, as well as to explain, as you have done so well, how the architecture works, to make sure people understand that Kubernetes is a piece of the puzzle, then event-driven autoscaling in the form of KEDA, the Cluster Autoscaler, and then also, optionally, Virtual Kubelet and virtual nodes for overflowing when you have, say, a Black Friday or Christmas ordering rush. >> Exactly. That's a good example. >> Where do I go to learn more about this? >> Yes. You can go to keda.sh for all our documentation, you can go to github.com/kedacore for our projects, and you can go to github.com/tomkerkhove/azure-friday-keda to get the demo if you want to run it yourself. >> Fantastic.
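The cluster-level pieces Tom describes, the Cluster Autoscaler and virtual nodes, are AKS features you enable on the cluster itself. A hedged sketch with the Azure CLI, using placeholder resource, cluster, and subnet names; flags follow the AKS documentation and may vary by CLI version.

```shell
# Enable the Kubernetes Cluster Autoscaler on an existing AKS cluster,
# letting AKS add or remove nodes between the given bounds
az aks update --resource-group myResourceGroup --name myAKSCluster \
  --enable-cluster-autoscaler --min-count 1 --max-count 10

# Enable virtual nodes (overflow to Azure Container Instances) via the
# AKS add-on; requires Azure CNI networking and a dedicated ACI subnet
az aks enable-addons --resource-group myResourceGroup --name myAKSCluster \
  --addons virtual-node --subnet-name myVirtualNodeSubnet
```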
We're going to make sure we have all of those links available in the show notes below, so you can check those out. You can find Tom on Twitter and GitHub, of course, along with all of those links, including, again, keda.sh. I am learning all about how to use Azure Functions on Kubernetes with event-driven autoscaling in the form of KEDA today on Azure Friday. >> Hey, thanks for watching this episode of Azure Friday. Now I need you to like it, comment on it, tell your friends, retweet it. Watch more Azure Friday. [MUSIC]
Info
Channel: Microsoft Azure
Views: 4,382
Keywords: azure friday, scott hanselman, tom kerkhove, azure mvp, keda, keda maintainer, kubernetes, k8s, kubernetes event-driven autoscaling, azure functions, autoscaler, horizontal pod autoscaler, built-in scalers, containers, event-driven, scale, deploying keda, sample application, scaling, Cloud Native Computing Foundation, CNCF, kubernetes-based functions, function app, triggers, helm, http trigger, event-driven scale, Kubernetes Metrics Server, metrics, metrics adapter, statefulsets, prometheus
Id: TftaxqNFsZY
Length: 21min 14sec (1274 seconds)
Published: Fri Feb 26 2021