Setting up Proxmox CLUSTER and STORAGE (Local, ZFS, NFS, CEPH) | Proxmox Home Server Series

Captions
Hello everyone and thank you very much for watching. This is me, Mr P, and welcome back to another episode of the Proxmox Home Server series, or should I say the Proxmox Cluster Home Server series, because in this video we're going to crank it to 11 and set up a Proxmox cluster. Before going any further I want to mention three things.

Number one: best practice for a Proxmox cluster is three nodes. I have node one, node two and node three ready for this video, but you can also run a cluster with two nodes, or with two nodes plus a QDevice. A two-node cluster will function, and you will be able to migrate VMs and containers between the two nodes, but it won't give you the high availability of a three-node cluster: if one of the nodes dies, the VMs hosted on that node simply go offline, and the cluster will not automatically migrate those VMs and LXC containers to the surviving node. With two nodes plus a QDevice you can set up high availability. The QDevice can be almost anything, even a Raspberry Pi, and it acts as an extra voter. Proxmox high availability works by giving each node a vote, and the nodes vote on who takes over when one of them dies, so the QDevice acts like a referee that gives its vote to the remaining online node so it can take over. But if the QDevice dies and one of the nodes dies as well, your cluster is effectively dead: no VMs will run and you can't do any migrations. So a three-node cluster is recommended (if you do stay at two nodes, there's a short command sketch for adding a QDevice at the end of this intro).

Number two: the three nodes don't need identical specifications. In my case all three nodes have the same amount of CPU, RAM and storage, and each has the same drives attached, but I could easily build a cluster where one node has 4 CPUs and another has 8, or one node has 4 GB of RAM and another 64 GB, and the cluster would still function. The problem is that if my node three has 64 GB of RAM and runs 12 VMs, and that node dies, Proxmox high availability will push all of those VMs onto a node with just 4 GB of RAM, which is not enough to host 12 VMs. So ideally all three nodes should have the same specification; if one node has 8 GB of RAM and another 16, you need to plan which VM will land on which node when one of them dies inside the high availability cluster.

Number three: before you set up the cluster, note that only one node is allowed to have VMs and LXC containers during the setup process. If PVE1 has two VMs running and PVE2 also has two VMs running, I won't be able to create the cluster, because during the join Proxmox will complain that one of the nodes is already hosting VMs or LXC containers. Ideally you would wipe all three nodes, do a fresh install of Proxmox with no VMs or LXC containers running, and then start building the cluster. But if one node already has VMs and the other two are blank, the one with the VMs needs to be the initiator, the node that creates the cluster.
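As a side note on the two-node case above, adding a QDevice is a short procedure. A minimal sketch, assuming the QDevice host (a Raspberry Pi, for example) sits at the placeholder address 192.168.1.50:

```bash
# On the QDevice host (e.g. a Raspberry Pi running Debian) -- placeholder IP 192.168.1.50:
apt install corosync-qnetd

# On every cluster node:
apt install corosync-qdevice

# On one cluster node, register the QDevice so it contributes the tie-breaking vote:
pvecm qdevice setup 192.168.1.50

# Confirm the extra vote is counted:
pvecm status
```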
So, let's start setting up the cluster with my three nodes. I'll pick PVE1 as my initiator, click on Datacenter, and inside the Cluster option there are two buttons: Create Cluster and Join Cluster. Because there is no cluster yet, I press Create Cluster and give it a name. Proxmox automatically picks up my NIC and the IP address that will be used to synchronise all the nodes in the cluster; if your node has more than one NIC it will show you more IP addresses here. I click Create, and it rejects the name because underscores are not allowed, so let's use a dash instead. Proxmox now creates the cluster for me, and once the task is done I can copy the join information. The cluster exists: config version one, one node, PVE1 with node ID 1 and one vote.

Now I click Join Information, which gives me automatically generated join details; one button copies the whole text block, and that block contains everything another node needs to join the cluster. I go to node two, Datacenter, Cluster, and instead of Create Cluster I click Join Cluster and paste that text block. It pre-fills all the information, and I just need to enter the peer's root password, which is the root password of node one. I enter it and click Join. While it's joining, the web UI might look like it has gone offline; I can close that, go back to node one's address and see both PVE1 and PVE2 showing up. There may be a temporary error saying there's no connection to PVE2 while the background work finishes; after a refresh I have PVE1 and PVE2 in my Proxmox cluster.

Before I join node three, here's what I meant about high availability. If I click on Datacenter for this cluster, scroll down and click HA (high availability), it currently says quorum is OK. But if I start creating highly available VMs with only two nodes, I'll get error messages saying quorum is not OK, because a fully functioning quorum wants three votes. So now back to node three: Datacenter, Cluster, Join Cluster, paste the same text block, enter the root password (still the root password of node one, not node two) and click Join. Again it might look like it's going offline; I close that, and on node one's page PVE3 shows up.
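The same create-and-join flow can be done from the shell. A rough sketch, assuming PVE1's address is the placeholder 192.168.1.101 and remembering that cluster names may contain dashes but not underscores:

```bash
# On pve1 (the initiator):
pvecm create mrp-channel

# On pve2 and then pve3 (the joining node must not host any VMs or containers;
# you will be prompted for pve1's root password):
pvecm add 192.168.1.101

# On any node, check membership, votes and quorum:
pvecm status
pvecm nodes
```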
After a refresh or two, PVE3's status goes from a question mark to a green tick, and now I have a three-node cluster. If I click on Datacenter, under Summary I can see three nodes, no VMs or LXC containers running, and if I scroll down it shows the combined resources: across the three nodes I have 12 CPUs, almost 12 GB of RAM and 175+ GB of storage, plus per-node information such as uptime, memory usage, CPU usage and IP address.

Next we're going to talk about storage. The first storage setup I want to show is called local: you installed Proxmox on a single drive, and that drive runs the Proxmox OS, hosts the VMs and LXC containers and stores the VM disks, templates and ISO files. As you can see, the VM disks live on this node, and if I go to node two its CT volumes live on that node. A simple setup: each of the three nodes has one storage that is used for everything.

First of all, let's set up high availability on this VM. If I click on VM 100 I can click More and choose Manage HA, or I can go to Datacenter, scroll down to HA and add it there. It has detected that I have three nodes, quorum is OK and the master has been picked as PVE1. If one of the nodes dies, the other two will vote and the VMs should automatically migrate, but with local disks used for storage that will not work, and I'll show you why. I go back to Datacenter, click HA, click Add and choose my Alpine VM: max restart is one, max relocate is one, resource ID 100, and I click Add (you can also create groups here, which I'll show you in a second). VM 100 has now been added as an HA resource and is queued.

Technically, if node one crashes, node two should pick up the pace and get VM 100 running. But it won't work, because the other nodes have no idea this disk exists: if I click on node one, under VM Disks there is a 6.44 GB disk, but it doesn't exist on the other nodes. So if this node suddenly goes offline, the cluster will say "this node has died, we need to migrate 100 to another node", but wait a minute, the other nodes don't have the disk. There is a feature meant to cover this: if I click on VM 100, then Replication, then Add, I can say I want this VM replicated to PVE2 every minute. But if I click Create it gives me an error, because replication does not work on local storage; local storage just stores data on that one node, and you cannot build highly available VMs and LXC containers on it.

You can still migrate guests manually. For example, if I right-click this LXC container and choose Migrate (or use the Migrate button), I can send it from PVE2 to PVE3. Restart mode means the container will be shut down, its data migrated, and then started again on the target, so effectively it restarts. I click Migrate, and Proxmox shuts down container 101, creates the disk on node three with exactly the same data that was on node two, and copies every block from node two to node three.
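As a sketch of the CLI behind those two GUI actions (registering VM 100 as an HA resource, and restart-mode migrating container 101), using the same IDs as in the video:

```bash
# Register VM 100 with the HA manager (Datacenter -> HA -> Add in the GUI):
ha-manager add vm:100 --state started --max_restart 1 --max_relocate 1

# Check HA resources and which node currently holds the master role:
ha-manager status

# Guests on local storage can still be migrated manually; containers use restart mode:
pct migrate 101 pve3 --restart
```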
That migration will take time depending on the container or VM disk size. As you can see the counter is climbing; I think this is a 6 GB disk and it has reached about 4.5, so it's nearly done. And here we are: the migration finished, about 8 GB transferred in total, completed in 46.8 seconds, and the LXC container has moved from node two to node three. Now node three's local storage, under CT Volumes, knows this disk exists, and on PVE2's local storage it no longer exists, because Proxmox took everything and moved it over to node three. So with plain local storage, high availability is simply not going to work.

The next storage option is ZFS, so now we're going to quickly create a ZFS pool on each node and start setting up high availability and replication. I click on PVE1 and scroll down to Disks; I have a spare drive attached (32 GB in my case), so I go to ZFS, Create: ZFS, and give the pool a name. On the first node you need to leave Add Storage checked; on nodes two and three you leave it unchecked. Ticking Add Storage on only one node is what lets the cluster properly share the ZFS storage definition across all the nodes. A single drive is fine, compression stays on (Proxmox uses lz4 for ZFS by default) and ashift 12 is fine by default, so I click Create, node one builds the pool and it shows as online. On node two I go to ZFS, give it exactly the same name as on node one, untick Add Storage, select the single drive, leave everything else at default and click Create, then do the same on node three (there's a shell sketch of this below).

Now all three nodes have a ZFS pool, but the storage only shows on node one. To fix that I click Datacenter, Storage, select the ZFS entry, Edit, and in the Nodes dropdown select nodes two and three so they can also use this pool, then click OK. Within a second or so all the nodes pick up the ZFS storage.

Next, I click on the Alpine VM, go to Hardware, select the 6 GB disk on local storage, Disk Action, Move Storage, select ZFS, tick Delete Source and click Move. Proxmox relocates the disk from local storage into the ZFS pool; it migrated successfully in 22 seconds, and the hard disk now lives on ZFS. Now I can go to Replication, as we tried before with local storage, except this virtual machine is now on ZFS. I click Add and say this VM should replicate to PVE2 every minute. Schedules start at one minute and go all the way up to once a year; depending on your VM size you might need to increase this to five or fifteen minutes, but that means the copy on node two will be up to fifteen minutes behind node one. For this demo I'll leave it at one minute, because this is only a 6 or 8 GB disk and it won't take long, leave the job enabled and click Create. Proxmox schedules the next sync to copy the disk to node two; if I look right now it shows nothing, because the synchronisation hasn't started yet.
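Stepping back to the pool creation for a moment: roughly the same ZFS setup from the shell, assuming the spare disk shows up as /dev/sdb on each node (a placeholder, check with lsblk first) and the pool is named ZFS as in the GUI:

```bash
# Run on each of the three nodes (node -> Disks -> ZFS -> Create in the GUI):
zpool create -o ashift=12 ZFS /dev/sdb
zfs set compression=lz4 ZFS

# Register the pool as a storage once, datacenter-wide, limited to the three nodes
# (this is what the "Add Storage" checkbox on the first node does):
pvesm add zfspool ZFS --pool ZFS --content images,rootdir --nodes pve1,pve2,pve3
```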
Back to the VM: if I click on the Alpine VM I can see the synchronisation happening (there's a loading wheel), so let's wait for it to finish. Here we are: the first sync completed in 21 seconds, and every subsequent synchronisation will be much faster because only the changes are sent. Now if I click on node two's ZFS storage there is a VM disk showing up for VM 100, and the same one is still on node one. Next I add another replication job on the VM, this time to node three, also every minute, and click Add. The cluster will now replicate VM 100's disk from node one to node two and node three every minute. Replication to node three finished in 16 seconds, and node two's next run took 4.5 seconds rather than 21, because only the changes were synced.

If I right-click this VM and say Migrate to PVE2, I get an error about migrating with a local disk. Note that every time you migrate a VM you need to detach anything that lives on local storage, such as ISO files, so I remove the ISO from the virtual CD drive; I need to power the VM off and on for that to take effect, and now the ISO is no longer attached. Now I can migrate the Alpine VM to PVE2. It warns that the migration might take a while because it's a 6 GB disk, but replication is already in place, so I click Migrate. The request goes to the HA stack ("request HA migration of VM 100 to PVE2", task OK), and Proxmox's high availability code does the migration behind the scenes. If I open the logs I can see it happening; it has already jumped to node two, and looking at the timings, the migration from node one to node two took 21 seconds. I could now migrate from node two to node three the same way, since replication is set up there too.

So what if one of the nodes dies? Let's force node two offline and see what happens. Node two is brought offline, and within a matter of seconds its icon changes from green to red, indicating the node is possibly dead. If I click Datacenter and scroll down to HA, I start seeing that node two is possibly dead; there it is, PVE2 with "old timestamp - dead?" in brackets. Voting happens in the background, a decision is made about where VM 100 should go, and after quite a long time the VM finally gets migrated to node one. So node two went offline, the cluster decided which node would take VM 100 and start hosting it, and it went to node one.
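Before moving on to HA groups: the two replication jobs created above have a pvesr counterpart. A sketch, assuming job IDs 100-0 and 100-1 are free and a one-minute schedule:

```bash
# Replicate VM 100's ZFS volumes to pve2 and pve3 every minute
# (schedules use Proxmox calendar-event syntax, e.g. "*/15" for every 15 minutes):
pvesr create-local-job 100-0 pve2 --schedule "*/1"
pvesr create-local-job 100-1 pve3 --schedule "*/1"

# Inspect the jobs and their last run:
pvesr list
pvesr status
```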
You can restrict where a VM ends up a bit more: under Datacenter, below HA, one of the sub-options is Groups, where you can create high availability groups. For example, I want the Alpine VM 100 to be hosted only by node two and node three, so I click Create, give the group the name Alpine and select node two and node three. I can set a priority for each node, and one thing I got wrong on my first attempt in this recording: the bigger the number, the higher the priority. I know node three has more RAM and CPU power to run Alpine, so PVE3 gets the higher number and PVE2 the lower one; that means if node three dies the VM will be hosted on node two, and as soon as node three is back online the VM will be migrated back to node three. If I tick Restricted it means the VM can only run on these nodes and will never end up on PVE1, and if I tick No Failback, then once node three comes back the VM will not be migrated back to it. I leave failback on, click Create, and now there's an Alpine group containing node two and node three with their priorities, first in line and second in line. Under HA I click the VM resource, click Edit and assign it to the group. The VM is currently running on PVE1, but the group insists it must run on PVE2 or PVE3, so the status immediately shows "migrate started": Proxmox realised the VM has a group assigned with specific priorities, migrates it off PVE1, and a moment later it lands on PVE3, exactly where the group wants it.

Migration works, but it's still not ideal: it takes a while to migrate between the ZFS pools on different nodes, and the other problem is replication itself. This Alpine VM only has 6 GB of storage; if it were 200 GB, replication would take forever, and it needs to be much faster and much better. The other way to get this working is NFS. I have an NFS share ready for this demo, so let's get it working: Datacenter, Storage, Add, choose NFS, and fill in the details. I give it a name, add the IP address of my NFS server, and the Export dropdown lists what I can mount; "galaxy-cluster" is my production share, so I pick "demo-cluster". Then I choose what this storage is allowed to hold, and I'm simply going to allow everything, and under Nodes I let PVE1, PVE2 and PVE3 use it. I click Add and the NFS storage appears under PVE1; PVE2 and PVE3 are still deciding whether to show it, but after a quick refresh all three nodes have access to the same NFS share.
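Since the group priorities are easy to get backwards in the GUI, here's a rough CLI version of the same group and assignment (higher number means preferred node); the names follow the video:

```bash
# Restricted group: VM may only run on pve2/pve3; pve3 (priority 3) is preferred,
# pve2 (priority 2) is the fallback; failback stays enabled by default:
ha-manager groupadd alpine --nodes "pve2:2,pve3:3" --restricted 1

# Attach VM 100's HA resource to the group:
ha-manager set vm:100 --group alpine
```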
Now I go to the Alpine VM's Hardware and move the VM disk onto the NFS storage, and while that's happening I want to point out one thing: if I click Datacenter, Storage and select the NFS entry, there is an option called Shared, which means a single storage unit is accessible from all the nodes at the same time (a CLI sketch of this storage definition appears a bit further below). The Alpine VM has actually already moved, and if I click on the NFS storage's VM Disks I can see this disk from PVE3, from PVE2 and from PVE1 alike. So now migration is as simple as right-click, Migrate, tell it where to go, and it happens much faster than with ZFS. Here are the logs for my migration of VM 100, which completed in 8 seconds; with ZFS it took around 20 to 22 seconds. Considering this VM is only 6 GB, ZFS migration gets far slower with a bigger disk, whereas on shared storage it's no problem at all.

Alpine actually jumped back to PVE3 afterwards, because under Datacenter, HA it's linked to the Alpine group, which forces the VM to run only on PVE3 or PVE2. Every time I migrate this VM to PVE1, as I'm going to do right now, the VM goes straight to PVE1 (migration completes in about 8 seconds), but the cluster's high availability immediately decides this is the wrong node and automatically migrates the VM back to PVE2 or PVE3 depending on priority; PVE3 has the higher number, so it lands back on PVE3, again in about 8 seconds.

So NFS works great, but NFS has a problem: a single point of failure. If my NAS dies, the NFS connection stops working, and that's it, every single VM whose storage was moved to NFS stops functioning. If each of these nodes runs five VMs and all of their disks are on NFS, all of them stop because NFS stopped. NFS is much faster than ZFS replication, but it brings a single point of failure, which is not ideal.

The best solution for highly available VMs, LXC containers and storage is Ceph. Ceph is an open-source, software-defined storage system that provides a scalable and reliable storage solution for managing data across a computer cluster. In a nutshell, every node inside your Proxmox cluster is responsible for keeping Ceph in sync, so all the drives stay in sync and your pool stays healthy. A few things to mention before the Ceph setup. First of all, I'm not a Ceph expert, but I've been running Ceph on my main Proxmox server for about two months now, and I've broken it more times than I'd like to admit, so I think I'm at the point where I can give you an understanding of how Ceph can be set up on your Proxmox, and in upcoming videos I'll demo what to do when things go bad. Next, each node must have a separate drive for Ceph to fully function. In my case each node has an additional 32 GB drive attached: a 64 GB drive is used for the OS and everything else, and the additional drive is currently blank on each of the nodes; node two has the same size drive and so does node three.
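Circling back to the NFS storage for a moment: a rough CLI equivalent of the share added above, assuming the NAS sits at 192.168.1.20 and exports /mnt/tank/demo-cluster (both placeholders):

```bash
# List the exports the NAS offers:
pvesm scan nfs 192.168.1.20

# Add the export as shared storage for all three nodes, allowing every content type:
pvesm add nfs nfs-server --server 192.168.1.20 --export /mnt/tank/demo-cluster \
    --content images,rootdir,iso,vztmpl,backup,snippets --nodes pve1,pve2,pve3
```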
The next thing is that you don't need the same number of drives on each node for Ceph to work: you can have two drives on one node and one drive on another, as long as the drive sizes are roughly the same. What I mean is that if you have two nodes with a 1 TB drive each and a third node with a 256 GB drive, you're going to have a bad time setting up Ceph; it will function, but not at full capacity. Ideally, all the drives used for Ceph storage should be the same size, or very close.

So let's start setting up Ceph. I click on PVE1, because I want it to be my Ceph initiator, and in the options list I click Ceph. It tells me Ceph is not installed on this node and asks whether I'd like to install it. I say yes, pick the most recent version, Reef 18.2, choose the no-subscription repository rather than enterprise, and click Start Reef Installation. It opens a command line and asks me to confirm; I type Y, press Enter and let the installation run. It takes roughly one to two minutes depending on how fast your Proxmox node is, how much RAM and CPU it has and plenty of other factors, so I'll come back when it's done.

And here we are, Ceph is installed. I click Next and start configuring it. For the public network I choose the one NIC available to me; if you have more than one, I suggest picking a dedicated one rather than the main internet-facing one, so that all the traffic between the Ceph cluster nodes (PVE1, 2 and 3) happens on a separate link; the less congested that connection is, the better. Because my setup has only one NIC, I choose that, and the cluster network stays the same as the public one. Then come the number of replicas and the minimum replicas: the number of replicas is how many copies of the data the pool keeps, one per node in my case, so I choose three, and for Ceph to keep functioning at least two of them need to be online while you sort out the third. Everything below that stays at default; I click Next, the installation is successful, and it lists the next steps: install Ceph on the other nodes, create additional Ceph monitors, create Ceph OSDs and create a Ceph pool. I click Next, and now under Ceph, instead of a "not installed" message, I get the status page: one monitor, one manager, no read/write performance information yet and no OSDs in, up or out.

Now I click on PVE2, then Ceph, and it offers to install Ceph, so I follow the same procedure: Reef, no-subscription, install, confirm with Y, and wait for it to finish. Here we are, Ceph is installed on node two. I click Next and it tells me the configuration is already initialised.
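The same installation and initial configuration can be driven from the shell with pveceph. A sketch, assuming the no-subscription repository and a single shared network of 192.168.1.0/24 (placeholder CIDR); the Ceph release itself (Reef 18.2 in the video) is chosen during the install step:

```bash
# Run on each node (pve1 first, then pve2 and pve3):
pveceph install --repository no-subscription

# Run once, on the first node, to write the cluster-wide Ceph configuration;
# point it at the network the Ceph traffic should use (ideally a dedicated NIC):
pveceph init --network 192.168.1.0/24
```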
The reason node two already knows about the configuration is that when you set up a Proxmox cluster there is a special directory that gets synced between every node, and that directory now contains the Ceph configuration; node two reads the synced directory and sees the config is already there. So I just click Next and Finish, and now I do exactly the same on PVE3. Before that, notice the warning that the OSD count is 0 where the default expects 3: an OSD is essentially an assigned drive, and none are assigned yet, so we'll do that as soon as node three is up and running. Click node three, install Ceph with exactly the same procedure as on PVE2, and here we are, Ceph is installed on node three; I click Next, same situation, node three knows the configuration is already initialised, Next and Finish.

Point one is complete, so now we need to set up the Ceph monitors, then the OSDs, and then the pool. If I click on any node and go to Ceph I see all this information, and the same information appears under Datacenter, Ceph. I click PVE1 or PVE2, it doesn't matter, click Ceph, and use the little triangle to reveal the sub-options. Configuration shows all of Ceph's settings, and I highly recommend not touching anything in there, otherwise Ceph will get broken. Under Monitor I need to add monitors; think of monitors as security guards that keep an eye on Ceph and make sure everything is ticking nicely. I add a monitor on PVE2, then a monitor on PVE3, and now I have three monitors; one still shows as stopped, but it starts running in a second. A manager is more like the overall boss: it knows how data needs to be synced and whether the Ceph pool is healthy. One manager is fine, but two is better, so I add PVE2 as a manager, and now PVE1 shows as active and PVE2 as standby, ready to take over the manager role if PVE1 goes offline. PVE3 looks unhappy, so I add PVE3 as a manager as well; now PVE1 is the manager and PVE2 and PVE3 are both on standby. If PVE1 crashes or goes offline, either of those two nodes takes over as manager and keeps checking whether the Ceph cluster is healthy.

Now let's click OSDs. Think of OSDs as checkpoints: an OSD represents a drive, and you create the checkpoints where Ceph will store data and keep it evenly spread across the entire pool. I'll start creating OSDs from PVE1, because each OSD gets an ID starting from zero, so PVE1 gets OSD 0, PVE2 gets OSD 1 and PVE3 gets OSD 2, and I like having everything in numerical order.
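Those monitor, manager and OSD steps roughly correspond to these pveceph commands, run on the node in question; /dev/sdb stands in for whichever blank spare disk that node has:

```bash
# Monitors keep watch over the cluster; run on each node that should host one:
pveceph mon create

# Managers: one active plus standbys; run on each node that should be able to take over:
pveceph mgr create

# One OSD per node on the blank spare disk:
pveceph osd create /dev/sdb
```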
So on PVE1: Ceph, OSD, Create OSD. It automatically picks up the available drive, so that's fine; I won't tick "encrypt OSD", though in your case you might need encrypted OSDs. Device class can be left on auto-detect, but I'll choose HDD manually; the class mostly affects logging and how Ceph treats drives of different speeds, and there's no big penalty either way. If you had two drives attached to a node, say a hard drive plus an SSD or NVMe, you could start setting up faster devices to speed up the pool's reads and writes, but as I only have one drive I choose HDD and click Create. Ceph creates my first OSD; once it's done it doesn't appear immediately, you just click Reload once or twice, and here we are: PVE1 has OSD 0, a hard drive, currently "in" but "down", which means Ceph is still initialising the drive. While that's happening I click PVE2, which shows exactly the same information because the Ceph configuration is shared across every node, and create an OSD there: drive automatically selected, class HDD, Create. Now I have two OSDs. I click PVE3 and do exactly the same thing, and here we are, all three OSDs are added; OSD 2 belongs to PVE3 and is still initialising.

One thing to point out: if each OSD were, say, 1 TB, that does not mean you suddenly, magically have 3 TB of storage. You still have 1 TB of usable storage, spread across three drives that are kept in sync. If I now click any node and then Ceph, the state says healthy, the OSDs are 3 up and 3 in, which is a good sign, and the PGs are active and clean; I'll explain PGs in a second, but basically that shows the Ceph OSDs are clean. We have three monitors, all green ticks, three managers, all green ticks, and it says the total is about 95 GB of storage, but again that's just 32 GB times three, so don't read it as "I have almost 100 GB".

So now I have monitors and managers created and the OSDs created. CephFS I've never touched, so I'm not really sure exactly what it does; I'm going straight to Pools. Under Pools I click Create and give the pool a name. Size is 3, meaning the data is spread across three drives, and min size is 2, meaning at least two drives need to be online for the pool to keep fully functioning. The crush rule is the replicated rule, so data is replicated across all the drives, and then there's the number of PGs, currently set to 128. Think of PGs as tokens: if you upload a gigabyte of data to this pool, Ceph needs a certain number of them to replicate it properly across every drive, and a running VM that constantly reads and writes to Ceph keeps consuming them too. The ideal sweet spot for home users to start with is 128, and PG Autoscale Mode is on, which means that if Ceph needs more (or fewer) it can scale the PG count automatically. On my main production Proxmox server I started with 32 and it scaled up to 64, where it now happily sits. This number can go up and down during the initial VM and LXC setup, when you're doing a lot of work in Proxmox with data stored on Ceph, but once everything settles down it settles at whatever number keeps Ceph comfortable.
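The whole pool step maps onto a single pveceph call. A sketch using the same values as the GUI (size 3, min size 2, 128 PGs, registered as a Proxmox storage in the same step); the pool name is a placeholder:

```bash
# Create the replicated pool and add it as an RBD storage in one step:
pveceph pool create ceph --size 3 --min_size 2 --pg_num 128 --add_storages
```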
The rest of the settings I leave at default; I've never changed any of them on my own Proxmox, so I won't pretend to know exactly what they do. Make sure Add as Storage is ticked. So: size 3, min size 2, replicated rule, 128 PGs, click Create. The Ceph pool is created, and as you can see it shows up under node one, node two and node three, so all the nodes now have access to a single Ceph pool, about 62 GB in size.

Let's move a VM onto this Ceph pool. Under the VM's Hardware I select the disk, Disk Action, Move Storage, choose the Ceph storage and click Move. While this is happening, if I click on any node and then Ceph, I can see read and write performance: a lot is going on, and this is where the PGs get used, peering, active+clean and so on going up and down constantly. Further down I can see it writing at around 12 MB/s with a small amount of reads, because the data is being moved onto the Ceph pool. It finished in 31 seconds. I'll do the same with the Ubuntu container; you need to shut down an LXC container before you move its disk, which you don't need to do with VMs. So the container shuts down, I move its volume to Ceph, and while it migrates I watch the Ceph performance page; it briefly reports inactive PGs, which just means there weren't enough PGs ready for the data being moved, and Ceph does the necessary adjustments on its own. A few seconds later the task is OK, the container's disk has been moved, and I can start it straight from the Ceph pool; it starts right away and I can watch the reads and writes as it runs.

So now I have an LXC container and a VM whose disks are stored in Ceph. Let's migrate the VM to another node: migrate to PVE1, go for it, and let's check how it performs. With NFS it took about 8 seconds; here it took 9 seconds, which is pretty much what I expected. Like I said, I've been using Ceph on my main Proxmox server for about two months, and before settling on Ceph I used NFS for about a month and a half and did a lot of performance testing between the two. For my use case NFS and Ceph perform pretty much the same, but as I mentioned, NFS brings a single point of failure, and it bit me twice: once when my NAS decided to restart because of an update, and once when the NAS simply shut itself down. That meant every single VM, my Pi-hole, Home Assistant, all my files, all my paperless documents, everything went offline, because all the disks were stored on NFS. Speed-wise, for my use case, NFS and Ceph perform the same, but with Ceph I now have three points of failure instead of one.
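For reference, the disk moves and the test migration in this section have straightforward CLI counterparts. A sketch, with scsi0 and rootfs as placeholders for the actual disk and volume names:

```bash
# Move the running VM's disk onto the Ceph pool and drop the old copy:
qm disk move 100 scsi0 ceph --delete 1

# Containers must be stopped before their volume is moved:
pct shutdown 101
pct move-volume 101 rootfs ceph --delete 1
pct start 101

# Live-migrate the VM between nodes now that its disk is on shared storage:
qm migrate 100 pve1 --online
```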
What I mean by three points of failure is that the Ceph data is spread across three nodes, and if any one of those nodes dies, the Alpine VM, for example, gets migrated automatically. We can actually test this: I'm going to force node three offline. Because Ceph is shared across all three nodes, kept in sync byte by byte, zeros and ones, as soon as a node stops functioning Ceph knows exactly where the cut-off happened on the disk, the VM gets migrated, and it starts exactly where it left off.

Let's watch how Proxmox HA handles it. The node is dead; HA will detect that, confirm it, and then migrate according to the group policy, which says the VM belongs on PVE2 or PVE3. PVE3 is the one that died, so the VM needs to move to PVE2. While this is happening I want to show another difference: if you right-click an LXC container and choose Migrate, you'll see restart mode, meaning the container has to be shut down, moved and started fresh on the new node. VM migration runs in online mode: the VM stays up and accessible during the migration, and you might only see a tiny blip in connectivity at the split second the cutover happens. Meanwhile Ceph is complaining too: one OSD is down, so a third of the pool is degraded, and it moans like hell that something has happened, but Ceph keeps functioning. Let's wait for Proxmox HA to move the VM... there we go, the VM has migrated to PVE2.

Now I'll bring node three back online. As it comes back, Ceph detects that the node and its OSD have returned, Proxmox detects the node is back, and Alpine is already being migrated back to node three because the group policy prefers the node with the higher priority number. You can see Ceph healing itself in front of our eyes, getting everything back in order, almost everything green again; it does another round of peering and cleaning, and that's it. Ceph heals itself, making sure that what is written on the OSDs of the two surviving nodes is copied exactly to the returning one so all three drives are in sync, and my Alpine VM is back on node three thanks to the Proxmox high availability group, which tells the system this VM may only run on PVE2 and PVE3, moving to PVE2 when PVE3 is offline and back again when it returns. It keeps on healing for a while, because the systems I'm using for this demo are not really fast enough for Ceph; you want somewhat beefier nodes than I have here, I'd say at least 16 GB of RAM per node and a few more cores, but as you can see it does function even on this hardware. Let's watch it heal; it's still doing all sorts of work here.
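While testing a node failure like this, a few read-only commands are handy for watching Ceph and the HA manager from any node's shell (they only report state, a sketch):

```bash
# Overall Ceph health, degraded PGs and recovery/heal progress:
ceph -s

# Which OSDs are up/down/in/out and which host they live on:
ceph osd tree

# Where the HA stack has placed VM 100 and the state of each HA resource:
ha-manager status
```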
Making sure it's all ticking nicely... OK, it still shows undersized placement groups, and it will keep going and going until everything is healed. On my main Proxmox, after one of my mess-ups, it took about three hours to get everything healed and fully functioning; right now PVE3 still thinks an OSD is down.

Anyway, thank you very much for watching. I hope you enjoyed this very lengthy video about the Proxmox cluster; let me know in the comment section below what you think, because it took me a while to record, and hopefully you enjoyed watching it. If you have an idea for something you'd like me to show inside Proxmox, please let me know in the comments; I appreciate every comment and every bit of feedback. If you found this video helpful, consider subscribing, or at least click the like button. All the information I used for this video is in the description below, under the like button, and as always, I'll see you in the next video. Goodbye.
Info
Channel: MRP
Views: 3,205
Keywords: proxmox home server, proxmox server, proxmox zfs storage, proxmox nfs storage, proxmox ceph storage, ceph on proxmox, zfs on proxmox, nfs on proxmox, setting up proxmox cluster, proxmox cluster setup
Id: a7OMi3bw0pQ
Length: 46min 48sec (2808 seconds)
Published: Mon Dec 11 2023