[ Kube 1.5 ] Set up highly available Kubernetes cluster step by step | Keepalived & Haproxy

Video Statistics and Information

Captions
hello viewers, welcome to this video. In this video we're going to be looking at setting up a highly available Kubernetes cluster. A year ago, exactly a year ago, I did a video on kubeadm HA with multi-master, and I want to redo it because I didn't do it properly. What I did in that video, Kube 1.3, is basically this: we had two master nodes and all of our interactions went through a single load balancer. We set up HAProxy on that load balancer node with a configuration that load balances across the master nodes; kworker and every single Kubernetes component communicates with the API server through the load balancer, and your kubectl commands go through it as well. You can already notice the problem here: that load balancer is a single point of failure, so we need to make it redundant, and that's what we're going to be looking at in this video.

If I scroll down, this is our new setup. For this video I'm running everything on my laptop, a Dell XPS 9310 running Arch Linux with 8 cores and 16 GB of RAM, and I'm going to provision all my virtual machines using KVM/libvirt. The Vagrantfile in my kubernetes repository, which I'll show you in a minute, works with both VirtualBox and libvirt, but for this demo I'm going to use libvirt; the process is exactly the same either way.

What we're trying to achieve is a truly highly available Kubernetes cluster, using a tool called keepalived in addition to HAProxy. On the control plane we already have multiple masters, three at the minimum. Instead of one load balancer I'm going to introduce a second one for redundancy, and we're going to use a virtual IP that can travel between load balancer 1 and load balancer 2 depending on which one is available; that's done with keepalived. All the components in your Kubernetes cluster will interact with the API server through this virtual IP, which gets assigned to one of the two load balancer nodes. So we have high availability for the load balancer plane as well as the control plane: if load balancer 1 goes down, load balancer 2 can take the load. The virtual IP has logic behind it: it gets assigned initially to one of the two load balancer nodes, and if that node goes down the virtual IP shifts to the other load balancer automatically, so you can continue to run your kubectl commands and all your cluster components can keep talking to the API server.

All the virtual machines here are going to run Ubuntu 20.04, and I'm going to use Kubernetes version 1.22.0. All the Kubernetes nodes, masters as well as workers, will have 2 CPUs and 2 GB of RAM, and each load balancer gets 1 CPU and half a gig of RAM. You can have as many load balancers and as many master nodes as you want; if you follow the process I'm going to show you, you can extend the cluster to match your requirements. For this demo we're going to have just two load balancers, the minimum highly available setup.
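For reference, here is a minimal summary of the lab layout described above, written as a shell sketch. The addresses come from the video narration; the hostnames are just the labels used in the video.

```bash
# Lab layout used in this walkthrough (from the video narration)
VIP=172.16.16.100                                       # virtual IP managed by keepalived
LOADBALANCERS="172.16.16.51 172.16.16.52"               # loadbalancer1, loadbalancer2 (HAProxy + keepalived)
MASTERS="172.16.16.101 172.16.16.102 172.16.16.103"     # kmaster1..kmaster3 (2 CPU / 2 GB each)
WORKERS="172.16.16.201"                                 # kworker1 (2 CPU / 2 GB)

# every kubectl call and every cluster component reaches the API server via:
echo "https://${VIP}:6443"
```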
Let me show you my GitHub repository, justmeandopensource/kubernetes; I'll put a link in the video description. There's a directory for kubeadm HA with keepalived and HAProxy, and inside that a directory called external keepalived haproxy. There are two ways you can achieve this: either you have dedicated systems running HAProxy and keepalived, which is what we're going to be doing in this video, or you run HAProxy and keepalived as static pods inside each of the master nodes, so you can get rid of the separate load balancer machines entirely. I'm planning to show that in my next video, just for completeness.

If I show you the Vagrantfile, you can see there's a load balancer count; if you want more load balancers you can just update that variable. I'm using the Ubuntu 20.04 box, box version 3.3.0. 172.16.16.51 and .52 are the IP addresses for the load balancer nodes, 172.16.16.101 to .103 for the control plane nodes, and .201 for kworker1. The master count is three, so we'll be bringing up three master nodes, plus one worker node, all from the same Ubuntu 20.04 Vagrant box. There are a couple of provider blocks for VirtualBox and libvirt: a plain vagrant up will bring all the virtual machines up in VirtualBox, but if you want to do what I'm doing in this video, run vagrant up --provider libvirt to bring them up in KVM/libvirt on a Linux host. I've also got a little bootstrap script; all it does is enable password authentication and set the root password to kubeadmin on all six virtual machines.

Back in the README, the thing to note is that we'll be using 172.16.16.100 as the virtual IP; that's the address in the notes. First, let's bring up all the virtual machines. I go to my play directory, git clone my kubernetes repository, cd into kubernetes and then into the kubeadm HA keepalived haproxy / external keepalived haproxy directory, where we have the Vagrantfile and the bootstrap script. If I do vagrant up --provider libvirt it provisions those six virtual machines for me; it'll take a couple of minutes, so I'll pause the video and come back when it's done. The command completed, in less than a minute actually, and if I do virsh list you can see two load balancers, three master nodes and one worker node, six virtual machines all running fine. Let me also open virt-manager so you can see them there, and I can do vagrant status as well. So we have all of our virtual machines running; let's continue following the documentation. The first thing is to set up the load balancer nodes: we install HAProxy and keepalived, apply a couple of configurations, and see how this virtual IP actually works. We'll install keepalived and HAProxy on both load balancer 1 and load balancer 2.
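Before moving on, a quick recap of the provisioning commands just described. The repository URL follows from the channel name; the directory path is an approximation of what is said in the video, so check the repo README for the exact layout.

```bash
# clone the repository and move into the external keepalived/haproxy setup
git clone https://github.com/justmeandopensource/kubernetes.git
cd kubernetes/kubeadm-ha-keepalived-haproxy/external-keepalived-haproxy   # path is an assumption

vagrant up --provider libvirt   # or plain 'vagrant up' for VirtualBox
vagrant status                  # all six VMs should show as running
virsh list                      # same view from libvirt's side
```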
For that I'm going to open up another pane, ssh to .51 and .52, so that's load balancer 1 on the top pane and load balancer 2 on the bottom pane, and synchronise my panes so that whatever I run on one gets executed on both. The password is kubeadmin, which comes from the bootstrap script. The first thing is to install the keepalived and HAProxy packages, so let's install those. With keepalived and HAProxy installed, we now configure keepalived. What we're doing here is creating a health check script at /etc/keepalived/check_apiserver.sh. keepalived uses this script to find out whether the API servers are reachable from the load balancer node. Once we do the HAProxy configuration, this load balancer will be proxying requests to one of the three master nodes in a round-robin fashion, so we need to make sure the load balancer can talk to the master nodes via HAProxy, and that's the check this script performs. If the check script fails, keepalived assumes this load balancer has a problem communicating with the control plane nodes and switches the virtual IP to the other load balancer. Similarly, if that load balancer has problems reaching the API server, the virtual IP gets switched to one of the other load balancers; in our case we only have two. Let's go back to the documentation and run this: it creates the script and sets the execute permission on it. Copy that and paste it here.

That's the health check script, and here is the actual keepalived configuration. It's a very simple configuration: you can see the virtual IP address, 172.16.16.100, which will be managed by keepalived, and there's a track_script section referencing check_apiserver, which is the vrrp_script block here; when keepalived reaches that block, the script gets executed. Let me explain what this check actually does. The interval is 3, so keepalived runs the health check script every 3 seconds. The timeout is set to 10 seconds, so if, for example, the curl command to https://localhost:6443 (which HAProxy will forward to one of the master nodes) doesn't get a response within 10 seconds, that particular health check run is considered failed. fall 5 means it has to fail five times consecutively before the node is assumed to have failed, and rise 2 means it has to succeed twice consecutively before the node is assumed to be back online. weight -2 means that when it fails five times in a row, the priority of the load balancer node is reduced by 2. Further down you can see I've initially set the priority to 100 for both load balancer 1 and load balancer 2, and the state is BACKUP. Ideally one load balancer node would have the state MASTER and the other BACKUP, with a higher priority on one and a lower priority on the other, but to keep things simple I've set the same priority and state on both; when keepalived starts for the first time it will work out which one becomes the master.
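Here is a sketch of the package install and the health check script described earlier in this section, reconstructed from the narration; the exact wording in the repository may differ slightly.

```bash
# on both load balancer nodes
apt update && apt install -y keepalived haproxy

# health check script used by keepalived's vrrp_script
cat <<'EOF' > /etc/keepalived/check_apiserver.sh
#!/bin/sh
# fail the keepalived check if the API server is unreachable through HAProxy
errorExit() {
  echo "*** $*" 1>&2
  exit 1
}

# can this load balancer reach an API server via its local HAProxy frontend?
curl --silent --max-time 2 --insecure https://localhost:6443/ -o /dev/null \
  || errorExit "Error GET https://localhost:6443/"

# if this node currently holds the VIP, the VIP endpoint must also work
if ip addr | grep -q 172.16.16.100; then
  curl --silent --max-time 2 --insecure https://172.16.16.100:6443/ -o /dev/null \
    || errorExit "Error GET https://172.16.16.100:6443/"
fi
EOF
chmod +x /etc/keepalived/check_apiserver.sh
```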
Then we have the virtual_router_id, which has to be a number between 1 and 255. The keepalived process also communicates with the keepalived process on the other load balancer; it's kind of like a heartbeat: every five seconds it checks whether the other node is available, and if it isn't, the current node becomes the master. That's how the virtual IP shifts between the two load balancer nodes. Let's copy this configuration and paste it on both load balancer nodes. Then systemctl enable --now keepalived enables the keepalived service and starts the process; let's do that. That's done, and if I do systemctl status keepalived, the keepalived process is running on both nodes. If I do journalctl -f to follow, with -u for the service unit we're interested in, keepalived, you can see one of the nodes has become the master. You can also see "changing effective priority from 100 to 98"; that's because the health check script, check_apiserver, actually failed. The check we're doing is to see whether the load balancer node can reach the API server via https://localhost:6443 on the load balancer itself, and we haven't got to the HAProxy configuration yet; once HAProxy is in place, hitting port 6443 on the load balancer will forward the traffic to one of the three master nodes. We haven't set up the Kubernetes cluster yet either, so that curl command is bound to fail. The health check script also checks another endpoint: it checks whether we can reach the API server using the virtual IP, 172.16.16.100.
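Below is a sketch of the keepalived configuration as described above, with the values taken from the narration (interval 3, timeout 10, fall 5, rise 2, weight -2, priority 100, state BACKUP, VIP 172.16.16.100 on eth1, a five-second heartbeat). The virtual_router_id value and the authentication block are my assumptions; any id from 1 to 255 works as long as it matches on both load balancers.

```bash
cat <<'EOF' > /etc/keepalived/keepalived.conf
vrrp_script check_apiserver {
  script "/etc/keepalived/check_apiserver.sh"
  interval 3
  timeout 10
  fall 5
  rise 2
  weight -2
}

vrrp_instance VI_1 {
  state BACKUP
  interface eth1            # interface that carries 172.16.16.x; adjust if you only have eth0
  virtual_router_id 1       # assumption: any value 1-255, identical on both nodes
  priority 100
  advert_int 5              # heartbeat between the two keepalived processes
  authentication {          # assumption: simple shared-password auth
    auth_type PASS
    auth_pass mysecret
  }
  virtual_ipaddress {
    172.16.16.100
  }
  track_script {
    check_apiserver
  }
}
EOF

systemctl enable --now keepalived
```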
As I was saying, the check script also checks whether the current load balancer node holds the virtual IP, and if it does, whether the API server is reachable through the virtual IP as well. Understandable, right? So this script failed on both load balancer nodes, the check_apiserver check failed, and that's why the priority was reduced from 100 to 98: that's the weight -2, so whenever the script fails the priority drops by 2. We don't have to worry too much about that; once we configure HAProxy and complete the Kubernetes cluster initialisation it should all be fine. For now you can see load balancer 1 has become the master and load balancer 2 the backup.

One important thing I forgot to mention: the interface in the configuration, eth1, is the interface to which keepalived attaches the secondary IP address, the virtual IP. In my case it's eth1 because all my virtual machines have two network interfaces; if I do ip address show you can see eth0, which is the default Vagrant network, and eth1, which is the custom network range I specified in my Vagrantfile. In your case, if you're on a physical server or a different virtualisation setup and not following my Vagrantfile, you may only have one interface, eth0, and then you need to update the interface in the configuration accordingly. If I do ip address show eth1 you can see the virtual IP, 172.16.16.100, attached to load balancer 1, so the virtual IP is pointing to load balancer 1 at this point.

Let's continue. The keepalived part is complete, so now we configure HAProxy; we've already installed the HAProxy package, and now we add this configuration. We define a frontend in HAProxy, which is where it actually listens: on the load balancer node we listen for traffic on port 6443, and the default backend is the kubernetes backend defined below, so all traffic coming to 6443 is routed to one of the three master nodes in a round-robin fashion, since the balance mode is roundrobin (a sketch of that frontend/backend configuration follows below). Let's copy this HAProxy configuration and paste it into both load balancer nodes, then systemctl enable haproxy; HAProxy should already be enabled by default on Ubuntu, but on CentOS the service might not be, so we enable it anyway and then restart the haproxy service.

Now that the HAProxy service is running, we can go ahead and configure all our Kubernetes nodes. There are some prerequisites we need to apply on all of them, both master and worker nodes, so I'm going to open a new window with four panes: in the first I log into .101, in the second .102 which is kmaster2, in the third .103, and in the fourth .201 which is kworker1, and I synchronise the panes. The password is kubeadmin. So we've logged into all our Kubernetes nodes, three masters and one worker; let's follow the documentation. The first thing is to disable swap, swapoff -a, and also remove it from /etc/fstab so it doesn't come back when the nodes are rebooted. Let's run that command.
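Here is the HAProxy frontend/backend sketch referenced above, appended to the default configuration on both load balancer nodes. The backend addresses are the three master nodes from the video; the exact option names in the repository may differ slightly.

```bash
cat <<'EOF' >> /etc/haproxy/haproxy.cfg

frontend kubernetes-frontend
  bind *:6443
  mode tcp
  option tcplog
  default_backend kubernetes-backend

backend kubernetes-backend
  mode tcp
  option tcp-check
  balance roundrobin
  server kmaster1 172.16.16.101:6443 check fall 3 rise 2
  server kmaster2 172.16.16.102:6443 check fall 3 rise 2
  server kmaster3 172.16.16.103:6443 check fall 3 rise 2
EOF

systemctl enable haproxy
systemctl restart haproxy
```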
All right, swap is disabled on all the nodes, and next we disable the firewall. Then we need to load a couple of kernel modules, overlay and br_netfilter: one command loads them for the current session, and we also put them in a file so the modules get loaded every time the machine reboots. Copy that. That's done, and then we add the kernel settings: sysctl --system applies them for the current session, and again we put them in a file to make them permanent across reboots. Copy that. That's done, and now the interesting bit, installing the containerd runtime. I've stopped using Docker as the container runtime; for a long time now I've been using containerd. Let's do that; it'll take a minute or so. Containerd is installed, and now we add the Kubernetes apt repository and then install the Kubernetes components, kubeadm, kubelet and kubectl, version 1.22.0.

With the Kubernetes components installed on all the nodes, we can bootstrap the cluster. The following kubeadm init command is run on just one of the master nodes; you can pick any of them, it doesn't really matter, so I'm going to pick the first node, kmaster1, to initialise the cluster. Once the cluster is initialised, we'll join the other master nodes and the worker node to it. Let me turn off pane synchronisation so I don't accidentally run the init command on all the nodes; now I'm only running commands on kmaster1, and I'll make it full screen. So we're on kmaster1 and we're going to run this kubeadm init command. The difference here, because we're initialising a multi-master cluster, is that we're not tying everything to a single master node: by adding the --control-plane-endpoint option we tell all cluster components to communicate with the cluster through this endpoint, which is nothing but the virtual IP, 172.16.16.100, currently attached to load balancer 1, on port 6443; the load balancer proxies those requests to one of the master nodes behind it. We also have the --apiserver-advertise-address option. Again, if you only have one network interface you don't need to pass this; in my case I have two interfaces and I want the API server to use the second one, which has the address 172.16.16.101 for kmaster1. If you were running this command on kmaster2, you'd update this IP address accordingly. The pod network CIDR is 192.168.0.0/16.
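The following is a condensed sketch of the node preparation and bootstrap steps just described. Run the preparation on all nodes and the kubeadm init on kmaster1 only. The apt repository lines reflect the Kubernetes package location at the time of the video (it has since moved to pkgs.k8s.io), and --upload-certs is an assumption so that the control-plane join command printed by init works; follow the repo docs for the exact steps used in the video.

```bash
# on ALL nodes: disable swap and the firewall
swapoff -a
sed -i '/swap/d' /etc/fstab
ufw disable

# kernel modules and sysctl settings required by Kubernetes networking
cat <<EOF > /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
modprobe overlay
modprobe br_netfilter

cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF
sysctl --system

# containerd runtime, Kubernetes apt repo (2021-era location), and components pinned to 1.22.0
apt-get update && apt-get install -y containerd apt-transport-https curl
curl -fsSL https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
echo "deb https://apt.kubernetes.io/ kubernetes-xenial main" > /etc/apt/sources.list.d/kubernetes.list
apt-get update && apt-get install -y kubeadm=1.22.0-00 kubelet=1.22.0-00 kubectl=1.22.0-00

# on kmaster1 ONLY: initialise the control plane through the virtual IP
kubeadm init \
  --control-plane-endpoint="172.16.16.100:6443" \
  --upload-certs \
  --apiserver-advertise-address=172.16.16.101 \
  --pod-network-cidr=192.168.0.0/16
```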
We'll be using Calico as the overlay network, so we need to set the pod network CIDR accordingly. This command will initialise the Kubernetes cluster; it'll take a minute or so, so let me pause the video and come back when it's done. The kubeadm init command completed, and if you look at the last few lines of the output you can see the cluster join commands: you can join any number of control plane nodes by running the kubeadm join command shown there as root, and further down there's the command to run on worker nodes to join them to the cluster. Let's first join our master nodes. I copy the control plane join command, exit full screen, go to kmaster2, make it full screen and paste it, remembering to add --apiserver-advertise-address 172.16.16.102 since this is kmaster2; again, you don't have to do this if you only have one network interface, and it's only needed on the master nodes because they're the ones running the API server component. The command completed; let me exit full screen and clear the screen. Now I run the same command on kmaster3, full screen, paste it, and add --apiserver-advertise-address 172.16.16.103. That completed too; exit full screen, clear the screen. Now I go back to kmaster1 and copy the last command, the one for the worker node, and in kworker1, full screen, I paste it; here you don't need the --apiserver-advertise-address option because this isn't an API server, so I just hit enter. That has completed, so let me exit and go back to kmaster1.

There's one last thing to do: we haven't deployed our overlay network, Calico. Back in the documentation, this is the command I'm going to run, Project Calico version 3.18. Copy that and run it on kmaster1; you can actually run it on kmaster1, kmaster2 or kmaster3, it doesn't really matter. Calico is deploying. So that's all well and good; I exit out of kworker1 and out of kmaster3, and I'm back on my host machine in the bottom pane. Now I'm going to copy the kubeconfig file from one of the master nodes to my local machine. For that I make a directory, .kube, under my home directory, and run an scp command against root@172.16.16.101, which is kmaster1; the /etc/kubernetes/admin.conf file is on every master node, so you can copy it from any of them, into the .kube directory we just created, as config. The password is kubeadmin. With the kubeconfig copied to my local machine I can now run cluster commands: kubectl cluster-info shows the cluster is good, and you can see the interaction goes through the virtual IP, 172.16.16.100, which is again pointing to load balancer 1. Let's also look at kubectl get nodes: we have three masters and one worker, everything Ready, version 1.22.0, and with -o wide you can see we're running the containerd runtime, version 1.5.2, on Ubuntu 20.04.
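A sketch of the join and post-install steps described above. The token, discovery hash and certificate key come from your own kubeadm init output, so the values below are placeholders, and the Calico manifest URL is my assumption of the v3.18 path used at the time.

```bash
# on kmaster2 and kmaster3 (adjust the advertise address per node):
kubeadm join 172.16.16.100:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane --certificate-key <certificate-key> \
  --apiserver-advertise-address=172.16.16.102

# on kworker1:
kubeadm join 172.16.16.100:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash>

# on any master (with kubectl pointed at /etc/kubernetes/admin.conf): deploy Calico
kubectl apply -f https://docs.projectcalico.org/v3.18/manifests/calico.yaml

# on the host machine: copy the kubeconfig and verify
mkdir -p ~/.kube
scp root@172.16.16.101:/etc/kubernetes/admin.conf ~/.kube/config
kubectl cluster-info
kubectl get nodes -o wide
```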
Now what I'm going to do is run kubectl get nodes in a loop, every two seconds, and show you what happens if we bring down one of the master nodes, and so on. We're basically in the testing phase now: we've completed the entire setup, we've got the load balancer plane with HAProxy and keepalived, with the virtual IP currently pointing to load balancer 1, we've got the multi-master control plane with one worker node, and we're running our kubectl commands from my local machine. So kubectl get nodes runs every two seconds, and because our control plane is highly available and served through the load balancer, we shouldn't have any problem bringing down any of the master nodes. Let's simulate that: I power off kmaster2, and just to make sure, because it sometimes hangs while shutting down, I go to virt-manager and force it off as well, so the kmaster2 virtual machine is definitely down. Back here, you'll soon see kmaster2's status change to NotReady. We've definitely brought down kmaster2 and we can still access our cluster, the kubectl get nodes command keeps working, and there we go, kmaster2 is NotReady. Let's bring it back up: start it again, and once the machine is up and the kubelet service starts, kmaster2 comes back online, ready to serve traffic; yes, kmaster2 is Ready again. That's fine.

The next thing we can do: let me log into .51, password kubeadmin, that's load balancer 1, and into load balancer 2 as well. At the moment we can access our Kubernetes cluster, but what if one of the load balancer nodes goes down? If I do ip a s eth1 on both nodes, as of now the virtual IP, 172.16.16.100, is on load balancer 1 and load balancer 2 doesn't have that address, so all the traffic from kubectl and from all the cluster components goes through load balancer 1 and is round-robined to one of the master nodes behind it. So load balancer 1 is currently serving the traffic, and I'm going to reboot this server. It's gone, and your kubectl get nodes command is still returning results because it's now going through load balancer 2. Let me show you what's actually happening: on load balancer 2, if I do ip a s eth1, you can see the virtual IP, 172.16.16.100, has already switched over. I rebooted load balancer 1; let's log back into it, password kubeadmin, ip a s eth1 (shorthand for ip address show eth1), and the virtual IP is no longer attached to load balancer 1, yet we were still able to access the Kubernetes cluster the whole time. Let me also show you the journal, journalctl -u keepalived, on both nodes: you can see "Entering MASTER STATE" on one and "Entering BACKUP STATE" on the other, and the check_apiserver script succeeding, so it's all looking fine. When we rebooted load balancer 1, load balancer 2 became the master; that's how keepalived keeps track of which node should be the master and which the backup.

Let's do one more thing. What we just did was bring one of the load balancers down entirely; that's one scenario. If I show you the keepalived configuration again, keepalived has a couple of things it's checking.
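A small sketch of the failover checks used above; using watch for the two-second loop is my assumption, any loop will do.

```bash
watch -n 2 kubectl get nodes            # on the host machine: node status every 2 seconds

ip a s eth1 | grep 172.16.16.100        # on each load balancer: only the VIP holder shows the address
journalctl -fu keepalived               # watch MASTER/BACKUP state transitions

reboot                                  # on the load balancer that currently holds the VIP
# the VIP should appear on the other load balancer within a few seconds,
# and kubectl keeps working throughout
```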
One is the heartbeat between the two keepalived processes: in our case load balancer 1 is now the backup and load balancer 2 holds the virtual IP, so it's the master, and the keepalived process on each node checks every five seconds, the advert interval, that the keepalived process on the other node is still responding. That's how load balancer 2 originally noticed it wasn't getting any response from load balancer 1, and keepalived switched load balancer 2 to master. So that's the heartbeat check between the two load balancer nodes. But what if a load balancer node is actually online and working fine, yet can't forward requests to the backend Kubernetes master nodes? Our test so far was to bring a load balancer down completely, and obviously when one load balancer goes down the virtual IP switches to the other. What if both load balancers are running fine but one of them has problems reaching Kubernetes? Let's simulate that. At the moment load balancer 2 has the virtual IP; let's say load balancer 2 is up and running but there's a problem with HAProxy, or it can't communicate with the Kubernetes master nodes. That's where the check script helps. Let's try it: I'm going to systemctl stop haproxy on load balancer 2. The virtual IP is attached to this node and HAProxy is what forwards traffic to the Kubernetes master nodes, so if I stop the haproxy service, the load balancer itself is running fine, and the backup sees the master is still alive, but it's not actually doing its job. If I do that, you'll see a temporary loss of connection to the Kubernetes API server, and that's expected, because the check has to fail five times consecutively at a three-second interval: keepalived executes the check script on load balancer 2 every 3 seconds, and it has to fail 5 times in a row before the node is considered down, at which point the virtual IP switches to load balancer 1. And you can already see we're back on track: the kubectl get nodes command is running fine again, and if I do ip a s eth1 the virtual IP is no longer attached to load balancer 2, while on load balancer 1, 172.16.16.100 is now attached. So that's how it actually works, and I think that's all I wanted to show you in this video. As I said, you can obviously have more load balancer nodes, and in my next video, instead of running HAProxy and keepalived on separate virtual machines, we could try integrating them inside the master nodes as static pods and see how that works. Give it a try and let me know if you have any questions, I'll be happy to help, and I'll see you all in my next video. Until then, keep learning. Bye.
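To reproduce that second failure scenario from the transcript, assuming load balancer 2 currently holds the virtual IP:

```bash
systemctl stop haproxy                  # on load balancer 2, the current VIP holder

# the check script fails 5 consecutive times at 3-second intervals
# (roughly 15 seconds), then the VIP moves:
ip a s eth1 | grep 172.16.16.100        # now shows up on load balancer 1 only

systemctl start haproxy                 # restore load balancer 2 afterwards
```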
Info
Channel: Just me and Opensource
Views: 1,194
Rating: 5 out of 5
Keywords: just me and opensource, kubernetes ha, kubernetes high availability setup, set up ha kubernetes, highly available kubernetes setup ubuntu, kubernetes ha with keepalived, keepalived kubernetes ha, keepalived haproxy kubernetes ha, k8s high availability with keepalived, k8s ha basics, set up highly available kubernetes cluster, kubernetes ha setup step by step, step by step ha kubernetes setup, just me kubernetes ha, kubernetes cka ckad training, kubernetes beginners channel
Id: SueeqeioyKY
Length: 30min 49sec (1849 seconds)
Published: Sun Oct 03 2021