When you make a pod in Kubernetes, what happens behind the scenes? How do the Kubernetes components work together to bring that pod to fruition?
My name is Whitney Lee, and I'm on the cloud team here at IBM.
So, this is a cool exercise. It goes over all the basics of Kubernetes, but from the perspective of a pod being made. So, at the base of our system we have our nodes. The nodes are the worker machines that make up the Kubernetes cluster. In Kubernetes there are two types of nodes: we have the control nodes - in our case we're going to use just one control node for simplicity's sake, but in a production-level cluster you'd want at least three control nodes - and then we have the compute nodes. In our exercise here, we'll use two compute nodes, but you could have many, many compute nodes in a Kubernetes cluster. So, let's talk about a use case where we want to make a pod. Let's say I'm the one making a pod. So, here's me - my smile, long hair, ha ha, my bangs ... okay. So, I'm going to make a pod in Kubernetes, and I'm going to make that call with a kubectl command.
That's going to go into the Kubernetes
cluster and hit the Kube-API server. So, the Kube-API server is the main
management component of a Kubernetes cluster.
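To make this concrete, my kubectl call might look something like this - a minimal sketch, where the pod name, the file name, and the container image are all placeholders:

```yaml
# pod.yaml - a minimal pod definition (name and image are just examples)
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  restartPolicy: Always   # we'll come back to this setting later
  containers:
  - name: app
    image: nginx:1.25     # any container image would do here
```

Running kubectl apply -f pod.yaml is the call that goes off to the Kube-API server.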
Now, the first thing that the Kube-API server is going to do with my request to make a pod is authenticate it and validate it. So, it's going to say, “Who are you? Do you have access to this cluster?” Like, “Oh, you're Whitney! Cool, we know you. Come on in, make a pod!”
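By the way, if you want to check that answer for yourself, kubectl can ask the API server directly - assuming your kubeconfig is already pointing at the cluster:

```sh
# Ask the Kube-API server: am I allowed to create pods?
kubectl auth can-i create pods
# Prints "yes" if you're authenticated and authorized
```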
So, the next thing that happens is the Kube-API server is going to write that pod to etcd.
etcd is a key-value data store that's distributed across the cluster, and it is the source of truth for the Kubernetes cluster. (Word on the street is we have a really good IBM Cloud video about it.)
So, the Kube-API server writes that request to etcd, and etcd will return once it has made a successful write. At that point, the Kube-API server is already going to return to me that the pod is created - even though not a lot has happened in our system yet. That's because, at its core, Kubernetes uses etcd to define a desired state of what the system should look like, and then all the Kubernetes components that we're going to talk about today work together to make the actual state match that desired state. So, now that we have the pod recorded in the desired state, it is as good as created, as far as Kubernetes is concerned.
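You can actually see this desired-versus-actual split on any pod object: the spec section holds the desired state, and the status section holds the last observed actual state. (The pod name here is a placeholder.)

```sh
# spec = desired state, status = actual state as last observed
kubectl get pod my-pod -o yaml
# Immediately after creation, spec is fully filled in while
# status may still show the pod as Pending - nothing has run yet
```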
The next component I want to talk about is the scheduler. The scheduler is keeping an eye out for workloads that need to be created, and its job is to determine which node each one goes on. To do that, it's pinging our Kube-API server at regular intervals - usually every five seconds or so - to ask whether there are any workloads that need to be scheduled. So, the scheduler is going to ping the Kube-API server: “Hey, do we have any workloads that need to be created? No? OK.” “How about now? Are there any workloads now? No? All right.” “How about now?” “Oh, Whitney has a pod that needs to get created. Let's do that.”
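If you want to see roughly what the scheduler sees, you can ask the Kube-API server for pods that don't have a node assigned yet - a rough approximation of its query, not the scheduler's actual implementation:

```sh
# Pods with an empty spec.nodeName haven't been scheduled yet
kubectl get pods --all-namespaces --field-selector spec.nodeName=
```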
So, now that the scheduler knows that pod needs to get created on one of our compute nodes, let's take a pause from this and talk about our compute nodes for a moment. Our compute nodes have three major components. One is the Kubelet. The Kubelet is how the compute node communicates with the control plane - specifically, with the Kube-API server. Each compute node has a Kubelet. The Kubelet is going to register the node with the cluster, it will send periodic health checks so that the Kube-API server knows that our compute nodes are healthy, and it will also create and destroy workloads as directed by the Kube-API server. Each of our compute nodes is also going to have a container runtime engine that's compliant with the Container Runtime Interface (CRI) - in the past that's been Docker, but it could really be anything that's compliant. And then, finally, each node has a Kube-proxy, which isn't needed to create our pod today, but I would be remiss if I didn't mention it. The Kube-proxy helps the compute nodes communicate with one another if there are any workloads that span more than one node; generally, it helps them communicate.
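If you want to peek at these pieces on a real cluster, kubectl will show you each node's Kubelet version and container runtime; the output below is just illustrative:

```sh
kubectl get nodes -o wide
# NAME     STATUS   ...   VERSION    CONTAINER-RUNTIME
# node-1   Ready    ...   v1.28.2    containerd://1.7.2
# node-2   Ready    ...   v1.28.2    containerd://1.7.2
```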
Okay, that said, back to our scheduler. The scheduler is aware that we need to schedule Whitney's pod. What our scheduler is going to do is look at the available compute nodes. It's going to rule out any that are unsatisfactory - either because of limitations the cluster administrator set up, or because a node simply doesn't have enough room for my pod - and then, of the ones that are left, it'll choose the best one to run the workload on, taking all the factors into account. Once it has made that choice, does it schedule the workload? No - all it does is tell the Kube-API server where the pod should go.
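And if you want a say in that choice, the pod spec is where you express it. Here's a sketch with placeholder values: resource requests rule out nodes without enough room, and a nodeSelector narrows down which nodes are even considered:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  nodeSelector:
    disktype: ssd        # hypothetical label: only matching nodes are considered
  containers:
  - name: app
    image: nginx:1.25
    resources:
      requests:
        cpu: "250m"      # nodes without this much spare CPU are ruled out
        memory: "128Mi"
```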
Once the Kube-API server knows where the pod should go, does it schedule the workload? No - what it does is write that decision to etcd. After the successful write, the desired state and the actual state differ again, and the Kube-API server knows what it needs to do to make the actual state meet the desired state. Let's say the scheduler said we should run the pod on node 2. That's when the Kube-API server is going to let the Kubelet on node 2 know, “We need to spin up a pod on this node.” The Kubelet is going to work together with the container runtime engine and make a pod that has the appropriate container running inside. So, we have made a pod on a Kubernetes cluster.
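You can actually watch that whole handoff after the fact in the pod's event log. The exact messages vary by cluster and version, but the trail looks roughly like this:

```sh
kubectl describe pod my-pod
# Events (roughly):
#   Scheduled  - the scheduler assigned the pod to node 2
#   Pulling    - the Kubelet asked the container runtime to pull the image
#   Pulled     - the image is on the node
#   Created    - the container was created
#   Started    - our pod is running
```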
But there's one more management piece I want to talk about.
Let's consider a case where, when I made the pod, I set the restart policy to “always”, and then something happens to my pod and it goes down. How will the system know that I want a new pod to be created in its place? That is where the controller manager comes in - the last important component of Kubernetes we'll cover. The controller manager is made up of all of the controllers: there are many controllers running within the controller manager, each watching a different piece of the Kubernetes system. In particular, it's the replication controller within the controller manager that's going to help me make a new pod - I'm not doing anything at this point; my job was done up there. Just like the scheduler, these controllers are pinging the Kube-API server on a regular basis to get an update on the actual state of the cluster and make sure the desired state and the actual state are the same as one another. So, the replication controller, from contacting the Kube-API server, sees that my pod is gone, and it will take the necessary steps to spin that pod back up - or, honestly, create a new pod, because pods are ephemeral.
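In practice, the way you hand this babysitting job to the controllers is to wrap the pod in a higher-level object like a Deployment - the modern take on the replication controller's job. A minimal sketch with placeholder names:

```yaml
# A Deployment asks the controllers to keep one replica of this pod alive
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1            # desired state: one pod, always
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: app
        image: nginx:1.25
```

Delete that pod, and the controller notices the actual state (zero pods) no longer matches the desired state (one pod) and spins up a replacement.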
So, in conclusion, all these components are working together just to make my pod. In particular, we have the Kube-API server, the main management component of the cluster; etcd, our data store and source of truth for the cluster; the scheduler, which helps determine which of the compute nodes the workload should go on; and the controller manager, which is watching the actual state and making sure it matches the desired state. Thank you. If you have any questions
please drop us a line below. If you want to see more videos like this in
the future, please like and subscribe. And don't forget: you can grow your skills and earn a badge with IBM CloudLabs, which are free browser-based interactive Kubernetes labs.