>> Hey friends, are you struggling
to autoscale your applications on Kubernetes or do you want to
run your Azure Functions on it? As we all learned, choosing your compute infrastructure defines how you scale your applications. Tom, an Azure MVP, is here to show me how Kubernetes
event-driven auto-scaling or KEDA makes application
auto-scaling so easy, even I can do it. I'm going to learn how
today on Azure Friday. [MUSIC] >> Hey friends, I'm Scott
Hanselman and it's Azure Friday. I'm here with Tom Kerkhove
and we're going to talk about something exciting
which is autoscaling. I like anything where I don't have to do the work. How are you, sir? >> I'm very well. Thank
you. How are you? >> I'm very well. You're
going to show me how to go and take something like Azure
Functions and then host it. Not the way that it's done as
a software as a service or functions as a service but
hosted rather in Kubernetes, and then autoscale it. There's a lot of
interesting Lego pieces you're going to snap together here, and you're going to
introduce a new concept, a new project to us called KEDA, that makes it all possible. >> That's correct. >> What are we looking at here on your slide? What are the autoscaling responsibilities here? >> Yes, so Azure has a
lot of offerings on how you can host your applications, but the moment you
decide how to run them, you implicitly also decide
how you will scale them. For example, with Azure Functions, it's serverless technology, they will handle all
the scaling for you. So that's super simple and easy, you don't have to
worry about anything. But the problem is that sometimes you want to have control over the scaling, so that's not an option, or it's less ideal for you. Then the other option is something like Azure App Service, which is more of a PaaS service: you run your application and you can autoscale it with something like Azure Monitor Autoscale, where you define how it should scale while still using a PaaS offering. But if you want even more control, you can go to something
like Kubernetes. But if you run Kubernetes on
Azure Kubernetes Service, it's more of a cluster
platform as a service, meaning you still need to scale
your cluster and your application, and you have to understand
how all of that works. With KEDA, we want to make it a lot easier for you and basically give you the simplicity of scaling as it would be on Azure Functions
but bring it to Kubernetes. >> Okay, so if I
understand correctly, Azure Functions is fully
managed and it scales based on magic dust that they
have, it's their own thing. >> Yeah, that's correct. They
have their own scale controller. >> Okay, they have their
own scale controller, then Azure App Service, which I'm very, very familiar with, I can go and host Azure Functions
in a container in Docker. I can host it on my own machine. I can host it in App Service, and then it's on me to
decide how to scale. I can use Azure Monitor or
different things like that. But now if I put that container, that self-hosted runtime, into AKS or any Kubernetes anywhere, it really is on me to start
thinking about how to scale, and that's where KEDA comes in. That's not an Azure thing, that's an open-source Kubernetes event-driven autoscaling component. >> Yeah, that's correct. The Azure Functions team partnered with Red Hat in 2019 because they wanted to bring the same
capabilities to Kubernetes, because one of the
strategies of Azure lately was really to bring
Azure serverless anywhere. As of today, you can already
run your functions anywhere. Now we have a preview for Logic Apps. You can bring API Management wherever you want, you can run it on other Clouds for a multi-Cloud strategy, you can run it on the Edge, on boats, you can run it on-prem. You can basically host
it wherever you want. Now if you do that, that also means that you are in
control of that scaling aspect, and because of that, the
Functions team wanted to make it easier for customers
to basically do that. That's the goal of KEDA, make application
autoscaling that simple, just define how it should scale
and KEDA will manage all the rest. >> Very cool. That sounds like
something that everyone would like. >> I hope so. The problem is a bit that if you're using
Kubernetes without KEDA, it has some scaling capabilities out of the box. For example, when you want to scale your deployment, you can use the Horizontal Pod Autoscaler. But the problem is that it only supports CPU and memory.
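For reference, that built-in, CPU-only scaling is a plain HorizontalPodAutoscaler; a minimal sketch (the deployment name is just a placeholder):

```yaml
# Built-in Kubernetes autoscaling: an HPA that only looks at CPU utilization.
apiVersion: autoscaling/v2       # autoscaling/v2beta2 on older clusters
kind: HorizontalPodAutoscaler
metadata:
  name: order-processor-cpu-hpa  # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-processor        # placeholder deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```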
But if you have a queue worker, for example, you need to have external metrics from Azure Monitor, bring them inside the cluster, and scale based on them. So you will need to host your own autoscaling infrastructure and basically use adapters to get the metrics from external systems and scale based on those. Now the problem is you can only
have one of those adapters. So if you're using Azure and
maybe Prometheus with Kafka, this will not be possible. There are workarounds, like using Promitor to bring Azure Monitor metrics into Prometheus and then basing everything off Prometheus, but it's a bit cumbersome. With KEDA, we want to
make this super simple. So what we do is basically, instead of using all of that infrastructure, you just do a simple install of KEDA with Helm, and we have a full list of scalers where we know
how to get the metrics, we know how to authenticate, you just need to give
us the thresholds. Then what you can do is deploy a ScaledObject, which basically defines how you
want to scale and based on what, and we will basically manage all the rest by using
the scale controller, which uses the HPA under the hood, and we can do zero-to-N scaling. If there is no work, we will basically remove all the
instances and optimize the cost. That's the whole promise
of KEDA, basically. >> Forgive my ignorance, but Kubernetes is a monolithic thing that is also composed of many different microservices, so there are always different plug-ins and stuff. Is KEDA the kind of
thing you would want built into Kubernetes
or is it actually part of the strength of
Kubernetes that you can plug something so fundamental in and
also plug it into other stuff? >> Yeah, I think it's more
of a plug and play model where everybody who needs
this can just install it. I don't think it
should be in the core because it depends on a lot of external systems, and that's why it's a nice add-on. Of course it would be
nice if this could be integrated with AKS in
the future, for example. So if you create your cluster, you just opt in and say, "Hey AKS, already install KEDA for me so I
can get it going immediately." >> Interesting. Okay. >> Let's have a look at the demo. In this very basic demo, I have two apps. I have an orders app, which is a .NET Core worker processing an orders queue. Then I have an Azure Function here, which I'm deploying in my cluster. Because we want to optimize for
the costs and for compliance, we need to run everything on
Kubernetes because that is built on open standards and the customer does not want to have vendor lock-in, let's say. Why is this important? If they want to run it
on-prem or in another Cloud, it's fully portable and that's the beauty of
Azure Functions as well. You can run it anywhere. I will go to my demo. On the left, you can see
basically my orders portal, where you see the queue depth for
the orders and the shipments, and then I have a
bunch of CLIs here. Now before we go there, I will quickly go
through the demo itself. I am going to close this, and this demo is fully available on GitHub if you
want to run it yourself. The order component is just
a regular .NET Core worker, where we have an order processor that connects to the Service Bus queue, very basic. Then I just have one Azure
Function in a separate app, which will be connecting
to the shipments queue, and those who are using
Azure Functions already will see that this is just the same thing. I can go to Azure, deploy this function, and it
will work perfectly fine. They will scale everything for me. Now in this case, we want
to run it on Kubernetes. So what we did was add a Dockerfile to build each application, and I already pushed the images to the GitHub Container Registry
and I will deploy them. For every application, I have a deployment, and it's nothing fancy. You'll just see this is the image, this is my orders queue name, and the shipments queue name to which I will send the
shipment requests, and then I have the connection
string here as a secret. The secrets are created below, but the point here is that we are using the least-privilege principle. So my orders app will have send and receive permissions, and the shipments app will only have receive permissions. I'll come back to why that is important later. I will deploy that now to the cluster. I'm just applying this.
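A rough sketch of what one of those deployment manifests could look like; the image, names, and secret keys here are illustrative, not necessarily the exact ones from the demo repo:

```yaml
# Illustrative sketch of the orders deployment: the image, the queue names as
# environment variables, and the connection string pulled from a secret.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-processor
spec:
  replicas: 1
  selector:
    matchLabels:
      app: order-processor
  template:
    metadata:
      labels:
        app: order-processor
    spec:
      containers:
        - name: order-processor
          image: ghcr.io/tomkerkhove/keda-demo-orders:latest  # illustrative tag
          env:
            - name: ORDERS_QUEUE_NAME            # illustrative variable names
              value: orders
            - name: SHIPMENTS_QUEUE_NAME
              value: shipments
            - name: SERVICEBUS_CONNECTION_STRING
              valueFrom:
                secretKeyRef:
                  name: order-secrets            # send + receive rights only
                  key: servicebus-connectionstring
```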
>> Is this an AKS cluster or is this a local one? >> This is just an AKS cluster. It can be [inaudible] cluster, it can be a cluster on GCP, for example. It works fine on all of them. As you can see, it deployed the two deployments, and you'll see that we
are running one instance. What I will do is set up a watch here, so if something changes, we'll be able to see what's going on.
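In CLI terms, that's roughly the following (the manifest file name is illustrative):

```sh
# Deploy both apps and watch their deployments for changes.
kubectl apply -f deploy-apps.yaml
kubectl get deployments --watch
```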
Now, before we can autoscale it, we need to install KEDA. You can very easily do this with Helm. I will do a helm install of keda from kedacore, and I will install it in the keda-system namespace. This is available on our GitHub repo as a Helm repository which you can easily add. Then you can see that it just installs KEDA, and we're good to go.
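That install is roughly the following; the chart repo is the official kedacore one, and the namespace name just follows what's used in this demo:

```sh
# Add the official KEDA Helm repository and install KEDA into its own namespace.
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda-system --create-namespace

# Nothing is scaling yet, so this should come back empty.
kubectl get scaledobjects --all-namespaces
```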
is say, "Hey, KEDA, show me all the scaled objects," and you will
see that there are none. What is a scaled object? A scaled object defines how we
will scale our applications. That's what we will do now. I will show you what that looks like. Here, I have a separate
deployment file for Kubernetes where we have a secret, which is a different connection string with management permissions. Why is this important? KEDA requires management permissions because the Service Bus [inaudible] requires that. If you want to get the metrics, you need Manage permissions. That's also one of the beauties: you can assign different credentials to KEDA than to your deployments, to use that least-privilege principle.
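A sketch of that separate credential and the TriggerAuthentication that points at it, assuming KEDA v2 resource names (the secret name and key are illustrative):

```yaml
# KEDA gets its own Service Bus credential (with Manage rights), separate from
# the send/receive credentials the apps use.
apiVersion: v1
kind: Secret
metadata:
  name: keda-servicebus-auth               # illustrative name
type: Opaque
stringData:
  servicebus-connectionstring: "<connection string with Manage rights>"
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: servicebus-trigger-auth
spec:
  secretTargetRef:
    - parameter: connection                # feeds the scaler's connection setting
      name: keda-servicebus-auth
      key: servicebus-connectionstring
```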
Now, how will we scale? We do that with the ScaledObject, where we basically use scaleTargetRef, and that is the name of the resource that we want to scale. By default, it targets Deployments, but you can basically scale anything in Kubernetes. It can be a StatefulSet, it can even be a custom resource from another project like Argo CD, for example. In this case, we will just scale our Deployment. Basically, we're saying we want
to have a maximum of 10 replicas, and we are using the default
value where it means that we will scale all the way back
to zero if there is no work. Now, we just have to define
how it needs to be triggered. Here, we're saying we
want to scale based on Service Bus and we will
check the orders queue. If there are more than five messages, add another instance. Then the last thing we do is we
refer to an authentication here. That is a separate resource from KEDA where, basically, we define how to authenticate. In this case, we use
a Kubernetes secret. You could also use
Azure managed identity, you can refer to
environment variables. We have other things like HashiCorp Vault where you can pull these secrets from. The beauty is that these TriggerAuthentications can be reused across different ScaledObjects.
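Put together, the ScaledObject for the orders worker looks roughly like this, again assuming KEDA v2 and the illustrative names from the sketches above:

```yaml
# Scale the order-processor deployment on the depth of the "orders" queue:
# one extra instance per ~5 messages, up to 10 replicas, back down to zero.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor-scaler
spec:
  scaleTargetRef:
    name: order-processor        # the deployment to scale
  minReplicaCount: 0             # the default: scale all the way back to zero
  maxReplicaCount: 10
  triggers:
    - type: azure-servicebus
      metadata:
        queueName: orders
        messageCount: "5"
      authenticationRef:
        name: servicebus-trigger-auth
```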
I will go ahead and deploy this. Before I do that, let's add some work here. I will do a dotnet run, and I have this order generator here where, basically, we will just send a lot of orders to the queue, just to trigger some processing. In this case, we'll send 1,000 messages, and you see we're generating all these nice orders here with Bogus, a nice library to generate fake data, and you see that the orders queue is piling up. Eventually, the order deployment here will be able to cope with them, but because we only have one instance, this will take a lot of time. I will now deploy the autoscaling. You see that it creates a secret, the ScaledObject, and the TriggerAuthentication. Then when KEDA kicks in, you will basically see that
it will start scaling out. While that kicks in, you can also see it's already scaling out to four instances. It's adding them and starting them up. If they still can't keep up, it will add more and more and more until it hits 10, and then when all the work is done, it will scale back. You see the orders are literally going through the roof, so much so that the chart can't cope with them, which is actually pretty funny. But you see now, we're at eight. You can fully control how aggressive the scaling is; you can also say, "I want to wait five minutes before adding another instance."
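Those knobs live on the ScaledObject spec; a rough sketch, assuming KEDA v2 (the five-minute scale-up delay uses the HPA behavior passthrough, which may vary between KEDA and Kubernetes versions):

```yaml
# Illustrative tuning fields on a ScaledObject spec.
spec:
  pollingInterval: 30      # how often KEDA checks the queue, in seconds
  cooldownPeriod: 300      # wait 5 minutes of no work before scaling to zero
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleUp:
          stabilizationWindowSeconds: 300   # hold off before adding instances
```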
Now, we're at 10, and then you'll see that the chart will start going down. Now, while we wait for that, you can also, in the CLI, get more information about those ScaledObjects. Here, basically, we can see that this ScaledObject is going to scale a Deployment, and the target name is the name of our deployment. You can see the trigger, how it authenticates, and if it is scaling. As an operator, you can also use this to get an understanding of what is going on.
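From the CLI, that inspection is roughly the following (the ScaledObject name is the illustrative one from earlier; the exact columns vary by KEDA version):

```sh
# List ScaledObjects (target kind and name, triggers, authentication, readiness)
# and drill into one for the full status.
kubectl get scaledobjects
kubectl describe scaledobject order-processor-scaler
```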
Now, you see that the shipments queue is not really piling up a lot, but we want to reach zero here. What we are going to do is a kubectl apply here, and we're going to add autoscaling for the shipments as well. Again, before I do that, let's have a quick look at what it looks like. Here, it's a lot simpler. We just have one ScaledObject, and we can reuse the same TriggerAuthentication that we already defined, so we're reusing that credential. For the rest, it's completely the same. We just define that we want to use Service Bus, we're going to scale on the shipments queue, and for every five messages, we're going to add an instance.
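As a sketch, that second ScaledObject just points at the shipments function's deployment and reuses the same TriggerAuthentication (names are again illustrative):

```yaml
# Same pattern as the orders ScaledObject, reusing the existing credential.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: shipment-processor-scaler
spec:
  scaleTargetRef:
    name: shipment-processor     # the containerized Azure Functions deployment
  maxReplicaCount: 10
  triggers:
    - type: azure-servicebus
      metadata:
        queueName: shipments
        messageCount: "5"
      authenticationRef:
        name: servicebus-trigger-auth
```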
I will apply this. You'll see that the queue is starting to shrink. Now, with the ScaledObject for the shipments, we will also see additional instances when it kicks in. It's starting to add up to three, and then eventually, it will add more and more and more of them. Now, if you paid attention, Azure Functions is actually still pretty fast on Kubernetes as well, and it can already fairly easily keep up with all the shipments load, and you see how fast things are going down now. If we would wait until everything is finished, you would see all the instances here slowly go back down, removing instances all the way, and that's how easy
the autoscaling is. Basically, you can autoscale any running application; you just use the ScaledObject to point at what you want to scale and how you want to scale it, and then KEDA takes
care of all the rest. >> Wow. It seems like this is an example of what they
call a Cloud native app. This is a real classic
example of an app that is resilient and it uses the resources that it
needs to get the job done. It backs off, it
scales out as needed. Watching that queue come down
is really the proof of that. >> Yeah. KEDA here actually serves a bigger purpose as well, making autoscaling on Kubernetes a lot simpler. You can use KEDA for the application autoscaling, like we've seen in this case, which scales those services
across all the nodes. But Kubernetes is cluster
technology so we have to ensure that we have enough nodes. That's where the Kubernetes
cluster autoscaler kicks in, which is also supported by AKS. It's still your responsibility to set this up, so don't underestimate it. But at a certain point in time, the Cluster Autoscaler will not be able to cope with that, and you'll need to mitigate
or overflow, let's say. That's where virtual nodes kick in; that's another AKS feature which basically allows you to overflow your containers to ACI, while from a Kubernetes perspective, or for the operators, they are still running on Kubernetes. It's the same look and feel, but you just mitigate the resource starvation, let's say. Now, the whole selling point of KEDA as well is that if the workload or the service scales all the way back to zero, we basically free up a lot of resources and it allows us to scale down the whole cluster again. By using these three services or components together, you really have the autoscaling sweet spot for Kubernetes, thanks to the serverless features of Functions brought to Kubernetes.
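For reference, enabling those other two pieces on AKS looks roughly like this; the resource names are placeholders and virtual nodes have networking prerequisites, so treat it as a sketch rather than a recipe:

```sh
# Let AKS add and remove nodes based on pending pods.
az aks update --resource-group my-rg --name my-cluster \
  --enable-cluster-autoscaler --min-count 1 --max-count 5

# Enable virtual nodes so overflow pods can burst to Azure Container Instances.
az aks enable-addons --resource-group my-rg --name my-cluster \
  --addons virtual-node --subnet-name my-aci-subnet
```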
>> Very cool. It seems like it's on us to make sure the word gets out that things like KEDA exist, and to explain, as you have done so well, how the architecture works, so people understand that Kubernetes is one piece of the puzzle, then event-driven autoscaling in the form of KEDA, the Cluster Autoscaler, and then also optionally, virtual kubelet and virtual nodes for overflowing when you have, say, a Black Friday or Christmas ordering rush. >> Exactly. That's a good example. >> Where do I go to
learn more about this? >> Yes. You can go to keda.sh
for all our documentation, you can go to github.com/kedacore
for our projects, and you can go to
github.com/tomkerkhove/azure-friday-keda to use the demo if you
want to run it yourself. >> Fantastic. We're going
to make sure we have all of those links available in
the show notes below, so you can check those out. You can find Tom on Twitter
and GitHub, of course, and then all of those links
including again, keda.sh. I am learning all about how to use
Azure Functions on Kubernetes with event-driven autoscaling in the form of KEDA, today on Azure Friday. >> Hey, thanks for watching
this episode of Azure Friday. Now, I need you to like it, comment on it, tell your
friends, retweet it. Watch more Azure Friday. [MUSIC]