Understanding StatefulSets in Kubernetes

Captions
What is up YouTube, and welcome to another video. In this video we're going to be talking about Kubernetes StatefulSets. In order to understand a StatefulSet, you have to understand what state means. All software has some degree of state. In some applications it's not obvious, because the state of the application doesn't really impact us; those are usually called stateless applications. Other applications, like databases, message queues and caches, have a much higher degree of state because they rely on storage. In this video we're going to take a look at stateless applications, stateful applications, and then at how Kubernetes helps us with stateful applications through the concept of StatefulSets. Finally, we'll end with a real-life example where we run a Redis cluster in a StatefulSet on Kubernetes. The code is down below so you can follow along, and without further ado, let's go.

So what is state in software? Let's say we have an application. This could be an ASP.NET app, a PHP application, some Apache web server, it doesn't really matter. We have our customer, Bob, and let's say this server hosts our company website. Bob opens up the browser and types the address of the website; the browser hits our server, the server loads HTML files from disk and sends them back to Bob's browser, so Bob sees our website and Bob is happy. But what if this server dies? That is why in web applications we normally have a second web server with the same HTML files deployed to it, and we put a load balancer in between, sending 50% of traffic to one server and 50% to the other. That means if one of them dies, our application is still available.

Now, the HTML files on the server are some degree of state, but even when the load balancer requests files from different servers, it doesn't really impact Bob. Our application has a very low degree of state, meaning the state doesn't really affect Bob, so this is a stateless application. Because our application is just serving basic HTML files, there is no state that ties Bob's browser to a particular server, and that means our application is stateless. I always encourage people to build stateless applications, because managing state is hard.

Now let's take a look at a stateful scenario. Let's say Bob has to log in to our website, so Bob types in a username and a password. When a login happens on a website, we generally take the username and password, and after we've authenticated we create a session ID and a cookie that we store in the browser. When the authentication happens, though, it's the web server that generates the session and stores it in memory or on disk. Any subsequent request that comes into this web server can then be identified as belonging to Bob's session, and the server can serve the HTML files to Bob. But what happens when this server dies, or when the traffic goes to the other server where there is no session available? Well, Bob will be logged out, and Bob will have a bad day. That is because the state of our application is tied to a web server: the login information is tied from the browser directly to the server, which stores it in memory. This is not a good architectural design. It is always better architectural practice to decouple the state from the web server process.
So in this case, what we might do is deploy a data store. This could be Memcached, Redis, or a database, whatever technology you prefer to manage. We then remove the session store from the web server process and save the session in the data store instead. Now the two servers can come and go as they please, since we have decoupled the state from the process, and each web server can access the same state in the data store. We're back at the point where we have a stateless application, and the data store over here is the stateful application. Stateful applications like Redis, Memcached, or databases (NoSQL or SQL) may use memory for fast access to data, but they usually persist the data somewhere on disk. To make things a little more complicated, some of these stateful applications, like databases, may use their own replication technology to replicate data among multiple instances and achieve high availability.

Now let's take everything we know and try to imagine deploying a SQL database onto a Kubernetes cluster. You're probably wondering: why don't we just use a Kubernetes Deployment to deploy our SQL database? A lot of people don't understand the differences between a Deployment and a StatefulSet, so let's take a look at a Kubernetes Deployment first. In Kubernetes you have this thing called a node, a virtual machine that you want to run some workloads on. With a Deployment we normally create a pod, and our pod, as you can see, has a random hash appended to the pod name. Let's say we're deploying SQL: this pod will have a SQL container running inside it. By default the SQL process is not aware that it's running inside a container; it's just going to write to the filesystem the way any other database process would. The problem is that the container has a virtual filesystem attached to it, which means any data written inside that filesystem is lost when the container is destroyed.

With containers we normally solve this problem by mounting files from the host into the container. This is called a Docker volume: if you do docker run with the -v flag, you can mount a host path into a container path. If we take a look at this in Kubernetes, we can mount files from the node into the container, and in Kubernetes this is called a persistent volume of type hostPath. Kubernetes has many different types of persistent volumes, and I'm not going to cover all of them, as this is a StatefulSet video, but it's important to understand the basics of persistent volumes and how they play a role in stateful workloads. In this example we're using a hostPath to simulate a Docker volume: SQL is still accessing files on the filesystem, but it's actually accessing files on the host, because the host path is mounted into the container filesystem. This is a hundred times better than our previous setup, because now our data is persisted on the node, meaning that if the container dies and gets recreated, the data persists. This might be OK for a small business with a one-node Kubernetes cluster running non-business-critical workloads, but it's not good enough for business-critical systems: if the machine gets recreated, the data is lost, and if the host operating system's disk gets full, the operating system is going to become unstable.
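Before we get to the remaining problems, here is a minimal sketch of the hostPath approach just described, assuming a MySQL pod with illustrative names and paths (none of this is taken from the video's repo). It is roughly the Kubernetes equivalent of the docker run -v example above:

```yaml
# Minimal sketch of the hostPath approach (illustrative names and paths).
# The container writes to /var/lib/mysql as usual, but that directory is
# actually a folder on the node, much like `docker run -v /host/path:/container/path`.
apiVersion: v1
kind: Pod
metadata:
  name: mysql-hostpath-demo
spec:
  containers:
  - name: mysql
    image: mysql:8.0
    env:
    - name: MYSQL_ROOT_PASSWORD
      value: "demo-only"        # for anything real, use a Secret instead
    volumeMounts:
    - name: data
      mountPath: /var/lib/mysql # where MySQL writes its data files
  volumes:
  - name: data
    hostPath:
      path: /mnt/mysql-data     # directory on the node itself
      type: DirectoryOrCreate
```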
Also, if the pod gets scheduled onto another node, assuming you're running more than one node in the cluster, the data is not going to be on that server, so again you're going to have data loss. A much better improvement is to take a look at other types of persistent volumes. As I mentioned before, there are multiple different types: you have hostPath, you have volumes that attach to a single node like an SSD drive, and you can attach a network share like NFS file storage. Every cloud provider also has its own set of persistent volumes: Azure has Azure Disks, Amazon has Elastic Block Storage, and then you find open-source options like Ceph and StorageOS. In this example, we bring our own storage, like an NFS network share, and instead of hostPath we use a different type of persistent volume so we can mount the files onto that share. What we've done is improve things step by step: instead of coupling ourselves to the container, we moved on to the host path, and instead of coupling ourselves to the node, we've moved to an external type of persistent volume. We no longer rely on the container or the host, so we're in a much better place.

You're probably still wondering what is wrong with using a Deployment here, with multiple pods, to store data. There are still some fundamental problems with a Kubernetes Deployment in this solution. The first problem: what if we scaled up to two pods, creating another pod, XYZ? When Kubernetes scales a Deployment, the pods are going to use the same persistent volume, because by default pods that are part of a Deployment share persistent volumes. This is a problem for SQL, because we're going to have two instances of SQL writing to the same storage; pod ABC will write to the same storage as pod XYZ, which will result in data corruption.

The second problem is that, from a network point of view, pod ABC and pod XYZ don't have a reliable way of calling one another over the network. By default, pods in Kubernetes don't get their own DNS names, and you usually use what's called a Service to expose pods to other applications in the cluster. That means SQL is going to have a difficult time trying to set up a cluster and replication: pod ABC and pod XYZ are not going to be able to make network calls to each other to form master-slave replication. By default, the only way to get traffic between pods is to create a Service. That is OK if you have a client that wants to read data from SQL, because a Service load-balances between pods, but it doesn't help if pod ABC needs to make a network call specifically to pod XYZ.

The other problem is the pod names: pods get a random hash assigned to the end of the pod name, so the pods don't really have a network identity. They don't have a DNS name, and every time a pod gets destroyed and recreated, it gets a new randomized name. From a data replication point of view that is not good. Take something like a Redis cluster, where a node gets created and joins the cluster, and then comes back with a different name after being deleted and recreated; it's going to be hard for all the different instances to find each other in the cluster.
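For reference, this is roughly the kind of Service just mentioned, sketched with illustrative names. It gives clients one stable, load-balanced address in front of the pods; it does not give each pod its own stable identity, which is exactly the gap described here:

```yaml
# Sketch of a regular ClusterIP Service in front of the database pods
# (illustrative names). Clients get one stable DNS name, and traffic is
# load-balanced across whatever pods match the selector; that is why it
# does not help pod ABC call pod XYZ specifically for replication.
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  selector:
    app: mysql       # matches the labels on the database pods
  ports:
  - port: 3306
    targetPort: 3306
```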
So pods in stateful workloads need some form of network identity that persists across restarts. The final problem I want to show you is scaling. If we scale up to, say, five replicas, a Deployment will just issue three more pods at once. Deployments simply bring up a number of instances, and when we scale down they terminate instances in no particular order, possibly all at the same time. That's bad for something like a Redis cluster, where scaling can be quite delicate: you might want to create one pod at a time so that each one can join the cluster and become part of the replication. The same goes for scaling down: if you scale a Redis cluster down to one instance all at once, you're going to have a big outage.

So, to recap: pods in a Deployment share the same persistent volume; pods have a random hash at the end of their names, so they don't have a persistent identity; they don't have their own DNS names, so they can't be individually addressed; and Deployment scaling doesn't really help stateful workloads.

Now, if we updated our Deployment to be a StatefulSet with, say, three replicas, let's take a look at the improvements Kubernetes has made to let us deploy stateful workloads. Firstly, if we ask for three replicas, the StatefulSet will not issue three pods at once; instead it creates the pods one by one, in a very deliberate fashion. We get a pod, and the pod also gets its own DNS name: it starts with the name of the StatefulSet followed by an ordinal, which starts at zero, and every time we scale up the ordinal increases. The cool thing is that if pod 0 gets destroyed and recreated, it comes back with the same DNS name, so whatever wants to call it as part of a cluster can still address pod 0 over the network, even though its IP address might change. You can also see that the pod name doesn't have a random hash; it has an ordinal number starting from zero.

The other thing you'll notice is that every pod that is part of a StatefulSet gets its own persistent volume, and this is one of the fundamental differences between a StatefulSet and a Deployment: Deployments share persistent volumes, while StatefulSet pods each get their own. Kubernetes can use the dynamic storage provisioner to provision persistent volumes automatically as we scale up. Since we asked Kubernetes for three replicas, the next thing it does is create the second pod; you can see the ordinal has increased to one, and Kubernetes dynamically provisions its storage. Now we have two pods, both individually addressable. If we were running a Redis cluster here, pod 0 and pod 1 could be master nodes, form a cluster, and join each other as part of the scaling process. It's also good to notice that the StatefulSet creates pods one at a time and makes sure each pod is completely ready, meaning it has its persistent volume, its DNS record is available, and its network identity is persistent, before it creates the next one.
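Those stable per-pod DNS names come from pairing the StatefulSet with a headless Service, which the StatefulSet references through its serviceName field. Here is a minimal sketch, using the redis-cluster name and example namespace that show up in the demo later; the actual manifest is in the linked repo:

```yaml
# A headless Service (clusterIP: None) gives each StatefulSet pod its own
# DNS record instead of one load-balanced address. Assuming a StatefulSet
# named "redis-cluster" in the "example" namespace, the pods resolve as:
#   redis-cluster-0.redis-cluster.example.svc.cluster.local
#   redis-cluster-1.redis-cluster.example.svc.cluster.local
#   ...
# and those names survive restarts even though the pod IPs change.
apiVersion: v1
kind: Service
metadata:
  name: redis-cluster
  namespace: example
spec:
  clusterIP: None      # headless: per-pod DNS records, no virtual IP
  selector:
    app: redis-cluster
  ports:
  - port: 6379
    targetPort: 6379
```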
So if we had another node, the same thing would happen there: the third replica might start on that other node, it gets its own DNS record and its own network identity, the ordinal goes up to two, and it gets its own persistent volume as well. You can see that the scaling and rollout process of a StatefulSet is a lot more deliberate than that of a Deployment.

The other thing I wanted to show you is that a StatefulSet allows you to retain the persistent volumes when you scale down. Let's say we scale back down to one pod. Instead of just terminating all the pods like a Deployment would, the StatefulSet starts at the highest ordinal and terminates pods gracefully. We can see that our pod 2 has been terminated, but its storage has been retained. Then Kubernetes takes the next pod in the list and scales that one down too, also retaining its storage. So there's a careful scale-down process: Kubernetes takes the highest ordinal, terminates that pod, waits for the termination to finish, and then moves down the list in order. The same applies when we scale back up: the StatefulSet scales one pod at a time. It brings a pod back, and notice that pod 1 has the same name and the same DNS record; even though the IP address might be different, its identity and network identity have been persisted, and it mounts the same persistent volume again. Then the same thing happens for the last pod in the list. So a StatefulSet is a lot more careful in how it scales up and down, persisting the data, the pod name and the network identity.

One thing I want to make really clear: do not look at Kubernetes as a solution that makes your application stateful for you. What do I mean by that? As I mentioned earlier, data stores like Redis, Cassandra, MongoDB, Postgres and MySQL have their own clustering features. If you're planning to build a Cassandra, MongoDB, Postgres, MySQL or Redis cluster, you need to know what you're doing. Kubernetes StatefulSets are not going to solve the clustering components for you, and you need to know how that data store will behave if the storage is moved, detached or distributed. By understanding how the clustering of that application works, you can apply what you've learned about persistent volumes and StatefulSets and come up with a great solution. So Kubernetes can totally work for stateful workloads, if you know what you're doing.

The other thing I want to mention is that you need to choose the right persistent volume for the job. If you need high-speed reads and writes, you're going to have trouble with a network share and you might benefit more from something like SSD-backed storage. So make sure you educate yourself and read up about the different persistent volumes available.
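As a rough illustration of that point: on most clusters, picking faster storage is a matter of requesting a different StorageClass in your claim. The premium-ssd class name below is hypothetical; run kubectl get storageclass to see what your cluster or cloud provider actually offers:

```yaml
# Illustrative only: requesting a volume from an SSD-backed StorageClass
# rather than a network share. The class name "premium-ssd" is hypothetical;
# check `kubectl get storageclass` for the names available on your cluster.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fast-data
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: premium-ssd
  resources:
    requests:
      storage: 10Gi
```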
And the final thing I want to say is that if you're running in the cloud, there's a high chance your cloud provider already has a storage offering available for you; they might already have a Postgres database as a service or a Redis database as a service. This may cost a few dollars per month, or even a couple of hundred dollars per month, but it might be totally worth it, as it takes the burden away from your engineering team so they can focus on what's important and not get stuck on data replication and high availability.

But now let's take a look at a real-world Redis example: running Redis in a StatefulSet on Kubernetes. For those of you who are not familiar with my channel, everything I do is on GitHub. The docker development YouTube series repo has a kubernetes folder, and inside it I have a statefulsets folder with a statefulset YAML, as well as notes and an example application. In my notes file I have all the commands listed that we're about to run, so if you want to follow along, all the commands are recorded there.

The first thing we're going to do is run kubectl create ns to create a namespace that's going to hold all the resources we're about to deploy. The next thing is to check the storage class. Storage classes are basically the provisioners for storage that run within Kubernetes, so whether you're running Kubernetes on Docker for Windows, or you're running kind, Shipyard or k3s, or you're running in the cloud, there will be a storage provisioner; on local clusters it's typically one for hostPath-type volumes. You can see here I'm running on Docker for Windows: I run kubectl get storageclass and I have a provisioner for hostPath. The name is what we want to record, which is hostpath, and we'll use it in a second.

What I'm going to do is deploy a StatefulSet, which is going to showcase everything I showed you and spoke about earlier, as well as an example app that's going to talk to our Redis cluster. So in the example namespace I run kubectl apply against my statefulset file. Before we do that, though, let's take a look at what that file looks like. On the left-hand side, in the kubernetes folder, in the statefulsets folder, we have the statefulset YAML. You'll probably notice that the StatefulSet YAML structure is almost identical to a Deployment; the only subtle difference is that we need a serviceName, so we have to define a Service that's going to expose our pods. Then we have a number of replicas and a normal pod spec, as you can see: we're running a Redis 5.0 image, we're exposing a port, we have a command to start up Redis, and we have an environment variable to get the pod IP address. And then here's the crucial part: for our Redis to persist data, we have a mount path to a folder called /data, which means Redis will write its files into the filesystem under /data, and the way that data is persisted is through a volume claim. What we have here is a volume claim template: it uses the dynamic provisioner in Kubernetes to provision a volume for each one of the pods that gets created. We create a volume called data, it has a read/write access mode, and it uses the storage class name I mentioned earlier; this is the name you want to grab when you run kubectl get storageclass. Mine was called hostpath, but yours might be something different depending on your Docker installation or your Kubernetes cluster and where it's running. And we go and request 50 megabytes of storage for this volume.
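Here is a condensed sketch of the StatefulSet just described. The Redis startup command and arguments shown are illustrative stand-ins, and the Service it references is covered next; the exact manifest lives in the linked repo:

```yaml
# Condensed sketch of the StatefulSet described above; the full manifest
# (including the real Redis startup command and config) is in the repo.
# Each pod gets its own 50Mi volume through the volume claim template.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-cluster
  namespace: example
spec:
  serviceName: redis-cluster        # the headless Service covered next
  replicas: 6
  selector:
    matchLabels:
      app: redis-cluster
  template:
    metadata:
      labels:
        app: redis-cluster
    spec:
      containers:
      - name: redis
        image: redis:5.0
        ports:
        - containerPort: 6379
        command: ["redis-server"]
        args: ["--cluster-enabled", "yes", "--appendonly", "yes"]  # illustrative flags only
        env:
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP   # the pod IP environment variable mentioned above
        volumeMounts:
        - name: data
          mountPath: /data              # Redis persists its files here
  volumeClaimTemplates:                 # one PersistentVolumeClaim per pod
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: hostpath        # from `kubectl get storageclass`; yours may differ
      resources:
        requests:
          storage: 50Mi
```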
The other bit we have down here is just a simple Service. This is the Service that is required for the StatefulSet, so that we can expose our Redis instances to other applications, and we're exposing the Redis port on it.

Now the next bit happens pretty quickly. I run kubectl apply in the example namespace to apply this YAML, then we quickly run kubectl get pods, and we see each pod get created one by one; a StatefulSet lets each pod become ready before creating the next one. So if I do kubectl apply and then kubectl get pods, you can see we have one pod creating. If I run it again we have two, and as I keep running it the count increases, three, four, and eventually we'll have six. If I run kubectl get pv, you can see the same thing happening for volumes: Kubernetes creates the pod, creates the volume, binds that volume to the pod, and then moves on to the next pod. That is the beauty of StatefulSets; they help us scale much more carefully than a Deployment. So now we have six instances of Redis running.

The next bit is to create the example app, but before we do that we have to enable the Redis cluster, and I have three commands for this. First we grab the IPs of the pods into an environment variable, because by default the Redis cluster has not been created; we just have six standalone Redis nodes and we have to go and set up the cluster. Then we kubectl exec into one of the Redis pods and run the redis-cli cluster create command, which sets up our Redis cluster. We type yes, the cluster meet messages are sent, all the nodes join the cluster, and we're good to go. To confirm, I have another command that execs into the first node and runs redis-cli cluster info to grab the state of the cluster, and we can see the cluster state is ok and there are six nodes that are part of the cluster. In this example, the first three nodes are going to be masters and the last three are going to be replica (slave) nodes. If you want more information on running a Redis cluster, feel free to check out the Rancher blog post; that's where I got this setup from, and it's just the example I needed to showcase how persistent volumes, StatefulSets and clustering work together.

The next thing I'm going to deploy is the example application, which is pretty much just a counter: we hit a website, and every time we hit it, the counter increments by one and stores that increment in the database. So we run kubectl apply in the example namespace to deploy the example app, and if I head over to the browser we can see our application is up, and there's the counter: every time I refresh, the counter goes up. You can see I've hit it ten times, and the counter is stored in the database, so now we can go and test our ability to persist data.

Now I want to show you what the StatefulSet is capable of. If we run kubectl get pods, we have six pods. What I'm going to do is run a command to delete some pods, and I'm just going to
randomly delete a few of them: the first one, and that one there. That's going to go ahead and delete those pods, and they'll terminate gracefully, one by one, from the highest ordinal to the lowest. But notice that when I go to my browser and refresh, I've not lost any data. We're now on eleven, and I can go ahead and click all the way up to twenty with no data loss, because we have a cluster, the data is persisted on persistent volumes, and there's no state in the process itself; the processes get killed and recreated by Kubernetes. If we run kubectl get pods again, we see our six nodes up and running with no data loss.

Now, what if I scale this cluster down? Let's take a look at what happens if, in the example namespace, I run kubectl scale on the statefulset redis-cluster and go from six down to four replicas. If I do that and run kubectl get pods, we can see it terminating the pods one by one until eventually only four pods are running, and if I go to my browser there is still no data loss: Redis is taking care of the replication, storing the data across multiple pods, synchronizing it between the Redis instances and persisting it on the persistent volumes. The other thing to notice is that the persistent volumes are all retained; they're all still there, so the data is still there. I still have six persistent volumes even though I only have four pods, which means that when I scale back up to six, Kubernetes will create the containers one by one and mount them to the same volumes, not new volumes, so there's no data loss. Back in the browser we can keep clicking and the counter keeps climbing. This shows you how persistent volumes work in conjunction with StatefulSets and clustering.

Now I want to show you a more severe example: what if we scaled this down to one? So what I'm going to do now is scale the replicas down to one; let's say we had a bigger outage and almost all the pods were terminated. Now we can see we only have one pod running, and if I get the persistent volumes we can see all the data is still retained. The persistent volumes are doing their job of keeping the data, and the StatefulSet is doing its job of keeping pods mounted to the right persistent volumes. But now let's go take a look at our application. If I refresh the counter now, look what happens: we've lost connectivity with our Redis data store. So what has happened under the hood? Let's run the command to get the status of our Redis cluster. I kubectl exec on the first Redis instance and run cluster info, and now we have a cluster failure. The cluster size is still six nodes, but the cluster is down. This goes to show that when you're storing state on Kubernetes, it's not all about persistent volumes and StatefulSets: Kubernetes doesn't do the job of making sure our Redis cluster is healthy. We now pretty much have a Redis outage, and this is where your engineering team has to have the knowledge to repair that cluster and get it back up and running. It's not always just a matter of saying, okay, let's scale back up to six. If I do that and run kubectl get pods, we can see that Kubernetes goes and recreates the pods gracefully and binds them to the same storage, so technically speaking we shouldn't have an outage, right? But still, if I refresh, we have a
cluster outage; the cluster is down. So even though we have all our instances running again, it takes extra engineering effort to understand Redis clustering and how to repair that cluster when it goes down. I hope this video helped explain, clear up and demystify how stateless and stateful workloads run on Kubernetes, and that it helps you make smarter, more informed decisions about how to run stateful workloads on Kubernetes. Remember to check out the links down below for more videos in my Kubernetes development guide, as well as other videos in the DevOps space, and let me know in the comments what sort of videos you'd like me to cover in the future and how you manage state on Kubernetes. And remember to like and subscribe, and until next time, peace.
Info
Channel: That DevOps Guy
Views: 16,662
Keywords: statefulset, kubernetes, persistentvolume, pv, state, databases, storage, k8s, devops, learning, tutorials, guide, beginner, docker, containers, sql, redis, aws, azure, gcp, cloud, course, what, is, persistent, aks, eks, gke
Id: zj6r_EEhv6s
Length: 28min 44sec (1724 seconds)
Published: Mon Jun 29 2020