How to create a HA NFS Cluster using Pacemaker & DRBD on RHEL / AlmaLinux 9

Captions
This video is a guide to help you deploy a high-availability, two-node NFS cluster on a local area network. The tools we're going to be using to do this are DRBD, Pacemaker, and Corosync. The information in this guide will apply to Red Hat Enterprise Linux 9 and other compatible Linux distributions. The testing environment we will be using is AlmaLinux 9 with DRBD as packaged by LINBIT. If you need access to DRBD by LINBIT, you can contact the LINBIT sales team by email at sales@linbit.com. For those using RHEL, some of the instructions will be slightly different, but I will address those differences as we go. Once we set this up, data transfers between a client system and an NFS share should not be interrupted by the failure of a node. In this video we are going to set up the NFS cluster, simulate a node failure, and verify that our data transfer continues.

Before we get started, in order to keep this video on topic and as concise as possible, there are a few prerequisites that will need to be set up. You will need to have two nodes ready to go; in this example they will be called node A and node B. You will need to have LVM set up with a DRBD volume group, external IPs, crossover IPs, and we'll need a virtual IP for services to run on. We're also going to be opening some ports in our firewall, and if you are using RHEL, you will need to make sure your target systems are registered with Red Hat. You can find the specifics of all these prerequisites in the companion PDF guide for this video.

To get started, we're going to install the DRBD software from the official LINBIT repositories and register our two NFS cluster nodes to access those repositories. To install DRBD you'll need to have access to the LINBIT customer portal; if you don't have access, be sure to contact the sales team at sales@linbit.com. The first thing we're going to do is grab the LINBIT configuration script from linbit.com and run it on each of the nodes. A quick note: we'll be using sudo for most of this tutorial. When you run the script, it will prompt you for your LINBIT customer portal username and password. Once you authenticate, the script will list clusters and nodes that have already been associated. Here you can select the nodes you already have, or create new nodes in a new cluster. I already have some empty nodes to use, but if you don't, you can type whatever number is next to the "be first node of a new cluster" option and press Enter. Then we're going to save this data to the registration.json file in the /var/lib/drbd-support folder. Now it's time to enable the necessary repositories; in our case we're going to type 1 and press Enter, then type 3 and press Enter for Pacemaker and DRBD 9. Once you're done selecting repos, type 0 and press Enter. This will exit the prompt and ask if you would like to save the repo list file; say yes, and then repeat this process for each of the nodes.

Next we're going to enable the high availability packages. If you are using RHEL, the command for this will be slightly different than what I'll be using here in AlmaLinux; you can reference the PDF companion guide for this video for the specific command that applies to RHEL. Run sudo dnf config-manager --set-enabled highavailability, and we're going to run this on both node A and node B. Now we'll install DRBD, the DRBD kernel module, and the Pacemaker configuration system. To do this, run sudo dnf install drbd kmod-drbd pcs; we're going to run this on both node A and node B as well. Since Pacemaker is going to be responsible for initiating the DRBD service, we will need to disable the systemd service for DRBD. To do this, run sudo systemctl disable drbd, and we're going to need to do this on both node A and node B.
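To summarize this section, here is a minimal sketch of what you would run on both node A and node B. The repository and package commands are the ones mentioned above; the name and download location of the LINBIT registration helper script are assumptions, so check the LINBIT customer portal for the current details.

    # Fetch and run the LINBIT node registration helper
    # (script name and URL are assumptions; see the LINBIT customer portal)
    curl -O https://my.linbit.com/linbit-manage-node.py
    sudo python3 linbit-manage-node.py

    # Enable the distribution's high availability repository
    # (AlmaLinux 9 repo ID shown; the name differs slightly on RHEL)
    sudo dnf config-manager --set-enabled highavailability

    # Install DRBD, the DRBD kernel module, and the Pacemaker configuration system
    sudo dnf install drbd kmod-drbd pcs

    # Pacemaker will manage DRBD, so keep the standalone systemd service disabled
    sudo systemctl disable drbd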
Now that we've installed DRBD, we'll need to create our resource configuration file. To do this, create the file r0.res in the /etc/drbd.d folder. In this tutorial I'm going to keep it simple by using nano to set up these kinds of files, but of course you can use whichever editor you prefer. Copy and paste the configuration data into this new file, save, and exit back to the command line; we're going to need to do this on both node A and node B. Keep in mind this is a minimal configuration file; there are many settings you can add, but I'll put a link to the DRBD user's guide in the description for those who would like to learn more about that.

Next we will use drbdadm to create the resource metadata by running drbdadm create-md r0. This command should complete without any warnings. I know you're aware of having a good backup strategy, but if you get messages about existing data being detected and you choose to proceed, you will lose that data. So: backups, backups, backups. Now let's bring up the device on both nodes by running drbdadm up r0, and let's verify their states with drbdadm status. Okay, looks good: we'll see that both nodes are connected and are set as Secondary and Inconsistent. To have your data replicated, we'll need to put the resource into a consistent state, and to do this we have two options. First, we can do a full sync of the device, which could take a bit of time depending on the size of the disk, or we can skip the initial sync. In the case of this video we're going to go with option two and skip the sync, because this is a new setup anyway, so there's no data to sync. On just node A, we're going to run drbdadm new-current-uuid --clear-bitmap r0/0, and then check the status again to see that the disks are now reporting UpToDate.

After DRBD has been initialized, we'll need to make node A primary by running drbdadm primary r0. Now we'll create a file system on the DRBD device that we configured earlier; this only needs to be done on node A, as DRBD will take care of the replication to the other node. So run mkfs.xfs /dev/drbd0. Then let's create a new directory called drbd in the /mnt folder on both node A and node B.
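The actual r0.res contents are in the companion PDF; purely as a rough illustration, a minimal two-node resource file might look like the following, where the hostnames, crossover IP addresses, and LVM backing device are placeholders you would replace with your own values.

    resource r0 {
        device    /dev/drbd0;
        disk      /dev/drbdpool/r0;        # placeholder LVM logical volume
        meta-disk internal;

        on node-a {
            address 192.168.222.10:7788;   # placeholder crossover IP for node A
        }
        on node-b {
            address 192.168.222.11:7788;   # placeholder crossover IP for node B
        }
    }

The initialization commands from this section, in order:

    sudo drbdadm create-md r0     # both nodes: write the resource metadata
    sudo drbdadm up r0            # both nodes: bring the device up
    sudo drbdadm status           # both nodes: should show Secondary / Inconsistent

    # node A only: skip the initial sync on this brand-new, empty device
    sudo drbdadm new-current-uuid --clear-bitmap r0/0

    sudo drbdadm primary r0       # node A only: promote to Primary
    sudo mkfs.xfs /dev/drbd0      # node A only: create the file system
    sudo mkdir /mnt/drbd          # both nodes: mount point used later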
We're done with the initial setup of the DRBD device, so it's time to install Pacemaker and Corosync. We will be using Pacemaker as our cluster resource manager, or CRM, and Corosync will act as a messaging layer, providing information to Pacemaker about the state of our cluster nodes. Run the dnf install command for Pacemaker and Corosync, and then let's enable the services in systemd. Next we are going to create the configuration file for Corosync using nano, so copy and paste the data from the PDF companion into the /etc/corosync folder, and once that's done, let's go ahead and start the Corosync and Pacemaker services in systemd. Now repeat this process for installing and configuring Pacemaker and Corosync on each cluster node. Then we can verify that everything has been started and is working correctly by using the crm_mon command.

In this example we're going to be using two nodes, so we'll need to tell Pacemaker to ignore quorum, as that would only be used with three or more nodes. We'll do that by setting the no-quorum-policy property with the pcs command. We will also be disabling STONITH, since we will not be configuring node-level fencing in this guide; to do that, set the stonith-enabled property to false with pcs. We're going to do this only on node A, as Pacemaker will propagate these settings throughout the cluster.

Now that we have initialized our cluster, we can begin configuring Pacemaker to manage our resources. In this demo we will be creating our configuration file and applying the changes directly to a live cluster configuration, because it's a demo, but in a production environment this may be risky to do; instead, you may wish to work with crm_shadow configurations first. DRBD is the first resource we will configure in Pacemaker, and we're going to do this only on node A, as Pacemaker will propagate it throughout the cluster. We will create the configuration file by running pcs cluster cib drbdconf; this will pull a working version of the cluster information base, or CIB. Then we'll configure the DRBD primitive p_drbd_r0 and the promotable primary/secondary clone set for the DRBD resource using pcs -f. Then we'll verify and commit the configuration changes using pcs cluster cib-push. You will find the specific configuration commands in the PDF companion document linked in the description. Now if we run crm_mon to view the cluster state, we'll see that Pacemaker is managing our DRBD device.

With DRBD running, we can now configure our file system within Pacemaker. We'll need to configure colocation and order constraints to ensure that the file system is mounted where DRBD is primary, and only after DRBD has been promoted to primary. In our case we're going to create the file system resource on node A; then, using pcs -f drbdconf constraint, we'll set the order and then the colocation to INFINITY... and beyond. Wait, that's not it, but that's okay. You'll find the specific commands to run in the PDF companion document. Now let's apply our changes with pcs cluster cib-push and verify with crm_mon. You can also run the mount command, and you should see the DRBD device mounted at /mnt/drbd.
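The exact Corosync configuration and pcs commands live in the companion PDF, so treat what follows only as a rough sketch of what they typically look like. In this corosync.conf example, the cluster name, node names, and crossover addresses are placeholders:

    totem {
        version: 2
        cluster_name: nfs-cluster           # placeholder cluster name
        transport: knet
    }

    nodelist {
        node {
            ring0_addr: 192.168.222.10      # placeholder crossover IP, node A
            name: node-a
            nodeid: 1
        }
        node {
            ring0_addr: 192.168.222.11      # placeholder crossover IP, node B
            name: node-b
            nodeid: 2
        }
    }

    quorum {
        provider: corosync_votequorum
    }

    logging {
        to_syslog: yes
    }

The installation, service, and cluster-property steps described above might then look like this, with the property commands run on node A only:

    sudo dnf install pacemaker corosync          # both nodes
    sudo systemctl enable corosync pacemaker     # both nodes
    sudo systemctl start corosync pacemaker      # both nodes, once corosync.conf is in place

    # node A only: two-node tweaks mentioned in the video
    sudo pcs property set no-quorum-policy=ignore
    sudo pcs property set stonith-enabled=false

And for the DRBD and file system resources, something along these lines; p_drbd_r0 is the name used in the video, while p_fs_drbd, the operation intervals, and the exact constraint wording are assumptions for a recent pcs and Pacemaker (older versions spell the roles Master and Slave rather than Promoted and Unpromoted):

    # node A only: work against a file, then push it to the live CIB
    sudo pcs cluster cib drbdconf

    # DRBD primitive plus a promotable clone so one node runs as primary
    sudo pcs -f drbdconf resource create p_drbd_r0 ocf:linbit:drbd \
        drbd_resource=r0 \
        op monitor interval=29s role=Promoted \
        op monitor interval=31s role=Unpromoted
    sudo pcs -f drbdconf resource promotable p_drbd_r0 \
        promoted-max=1 promoted-node-max=1 clone-max=2 clone-node-max=1 notify=true

    # File system on the DRBD device, mounted at /mnt/drbd
    sudo pcs -f drbdconf resource create p_fs_drbd ocf:heartbeat:Filesystem \
        device=/dev/drbd0 directory=/mnt/drbd fstype=xfs

    # Mount only where DRBD is promoted, and only after promotion
    sudo pcs -f drbdconf constraint order promote p_drbd_r0-clone then start p_fs_drbd
    sudo pcs -f drbdconf constraint colocation add p_fs_drbd with Promoted p_drbd_r0-clone INFINITY

    sudo pcs cluster cib-push drbdconf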
Now let's configure our NFS server and the exported file system. We'll need to install nfs-utils and rpcbind on both of our nodes to do this, then make sure that rpcbind is set to start at boot on both nodes by running systemctl enable rpcbind --now. By the way, I wasn't aware of the --now part for longer than I want to admit, so for those who aren't familiar, it will enable and start a service in one command rather than using the enable and start commands separately.

All right. Since the file system needs to be mounted locally before the NFS server can export it to the clients on the network, we will need to set the appropriate ordering and colocation constraints for our resources: NFS will start on the node where the file system is mounted, and only after it's been mounted; then the exportfs resources can start on that same node. We will need to define the primitive for the NFS server. The NFS server requires a directory to store its special files, and this needs to be placed on our DRBD device, since it needs to be present where and when the NFS server starts to allow for smooth failover. We are going to run the following commands on just one of the nodes. We'll use pcs -f drbdconf resource create to add the resource to the configuration file, and then pcs -f drbdconf constraint to set the order and colocation. After that we'll run pcs cluster cib-push to apply the changes and use crm_mon to check the status; after a few seconds you should see the NFS server resource start.

With the NFS server running, we can create and configure our exports. You will do this from whichever node has p_nfsserver started in your cluster. The first thing we're going to do is make a new directory with mkdir -p /mnt/drbd/exports/dir1, then use the chown command to set the user and group to nobody. Then create a new resource using pcs -f drbdconf, following that up with pcs -f drbdconf constraint to set the order and colocation, and then again run pcs cluster cib-push to apply the changes. Like before, the specific commands for this can be found in the PDF companion document. You should now be able to use the showmount command from a client system to see the exported directories on the current primary node.

Now let's configure the virtual IP. We'll do this because the virtual IP will provide consistent client access to the NFS export if one of our cluster nodes should fail. To do this, we will need to define a colocation constraint so that the VIP resource will always start on the node where the NFS export is currently active. Once again we'll run pcs -f drbdconf resource create to set up the resource, and then pcs -f drbdconf constraint to set the order and colocation, and, as you might expect, we'll then run pcs cluster cib-push to apply our changes. You're probably used to it at this point, but the specific commands can be found in the PDF companion document in the description. Now we should be able to use the showmount command from a client system, specifying the virtual IP, and see the same output as we saw before.
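Once more, the authoritative commands are in the companion PDF; purely as an illustration, the NFS server, export, and virtual IP resources might be wired up roughly like this. Only p_nfsserver and the /mnt/drbd/exports/dir1 directory come from the video; the other resource names, the shared-info directory, the client network, fsid, export options, and the virtual IP address are placeholder assumptions, and the parameters are those of the stock ocf:heartbeat resource agents.

    # both nodes: NFS packages, with rpcbind enabled and started in one step
    sudo dnf install nfs-utils rpcbind
    sudo systemctl enable rpcbind --now

    # node A only: refresh the offline copy of the CIB before the next batch of changes
    sudo pcs cluster cib drbdconf

    # NFS server, keeping its state directory on the replicated DRBD volume
    sudo pcs -f drbdconf resource create p_nfsserver ocf:heartbeat:nfsserver \
        nfs_shared_infodir=/mnt/drbd/nfsinfo
    sudo pcs -f drbdconf constraint order p_fs_drbd then p_nfsserver
    sudo pcs -f drbdconf constraint colocation add p_nfsserver with p_fs_drbd INFINITY

    # Export directory, created on the node where the NFS server is running
    sudo mkdir -p /mnt/drbd/exports/dir1
    sudo chown nobody:nobody /mnt/drbd/exports/dir1
    sudo pcs -f drbdconf resource create p_exportfs_dir1 ocf:heartbeat:exportfs \
        clientspec=192.168.1.0/24 directory=/mnt/drbd/exports/dir1 \
        fsid=1 options=rw,sync
    sudo pcs -f drbdconf constraint order p_nfsserver then p_exportfs_dir1
    sudo pcs -f drbdconf constraint colocation add p_exportfs_dir1 with p_nfsserver INFINITY

    # Virtual IP that clients will use to reach the export
    sudo pcs -f drbdconf resource create p_vip_nfs ocf:heartbeat:IPaddr2 \
        ip=192.168.1.200 cidr_netmask=24
    sudo pcs -f drbdconf constraint order p_exportfs_dir1 then p_vip_nfs
    sudo pcs -f drbdconf constraint colocation add p_vip_nfs with p_exportfs_dir1 INFINITY

    sudo pcs cluster cib-push drbdconf

    # From a client system, list the exports through the virtual IP
    showmount -e 192.168.1.200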
All right, great, we're done configuring the setup, so now let's test it out. There are many ways we could test the persistence of our cluster's NFS export through a failover, but for this video we're going to do it with the handy-dandy dd command: we're going to create a large file in the mounted export directory using dd while failing over. We're going to use our secondary node to mount the exported file system, and then use dd to create a one-gigabyte file named write_test.out. Before that command completes on the client node, we're going to reboot the primary node by switching to root and running echo b > /proc/sysrq-trigger. The node should reboot immediately, causing the cluster to migrate all the services to the peer node.

The failover should not interrupt the dd command on the client system, and the command should complete the file without error. We can also verify our failover by running drbdadm status on the node and seeing that the other node has been promoted to the primary role.

There we go: you've successfully set up a high-availability NFS cluster using DRBD, Corosync, and Pacemaker. There's plenty more you can do with this setup, like configuring additional exports or testing the failover by playing some music on one of the nodes; we'll leave that stuff up to you to explore. I will say, though, I did want to do the music test here, but there's the whole copyright stuff on YouTube, so I settled for dd. If you have any questions regarding setting up a high-availability NFS cluster in your environment, you can contact the experts at LINBIT; I put a link in the description. I hope you found this video helpful, and if you did, please smash that like button and be sure to subscribe to the LINBIT YouTube channel if you'd like to get more tutorials like this one. That's it for this video, and I'll see you in just a LINBIT. I've been holding that one for a long time. Not awkward.
Info
Channel: LINBIT
Views: 9,770
Keywords: high-availability, NFS cluster, DRBD, Pacemaker, Corosync, Linux, open source, free software, Red Hat, RHEL, Red Hat Enterprise Linux 9, Red Hat Enterprise Linux, RHEL 9, Linux distro, AlmaLinux 9, AlmaLinux, LINBIT, NFS share, virtual IP, LVM setup, nano editor, cluster resource manager, systemd, STONITH, NFS
Id: IxFI0Ms0ULA
Length: 16min 29sec (989 seconds)
Published: Tue Oct 25 2022