XCP-ng: A Different Kind of Virtualization Platform?

Reddit Comments

Different than what? XCP-ng is a Xen distro.

👍 15 · u/Jose_D · Sep 07 2021

I do love XCP-ng.

My biggest issue with it is how poorly it handles power loss.

👍 4 · u/Eideen · Sep 07 2021

I'm excited for the potential: xo-installer. This looks like a great next step for my local VMs.

👍 2 · u/mcstafford · Sep 07 2021

I'd rather manage XCP-ng with XO community than OpenStack. With XCP-ng I don't need to convert images to qcow; I can easily export to VirtualBox or VMware. In XCP I can easily do operations without random errors. With XCP-ng I don't need to manually fix a disk that failed to spawn. I'll throw a party the day we get to shut down OpenStack in production at work.

👍 2 · u/GamerLymx · Sep 07 2021

Didn’t expect to see Wendell here :)

👍 1 · u/syscreeper · Sep 08 2021
Captions
If you're into virtual machines and virtualization, a platform that should be on your radar is XCP-ng. I hesitate to call it a virtualization distro of Linux, because it's so much more than that. It's been on my radar for a long time, and I've messed around with it. It's different from Proxmox, it's different from Hyper-V, and it's different even from KVM on Linux, although technically XCP-ng is a really lightweight Linux distribution built around the Xen hypervisor and its orchestration software. It is open source, but there are paid versions, and the paid versions have more features than the base version. You can, however, color outside the lines and build your own XCP-ng. Let me talk about what makes it what it is and how it's different from something like Proxmox, VMware, or Hyper-V. Let's dive in.

First of all, our setup here is based around the really awesome 2U Tyan server that I've covered before. This is four EPYC nodes in a 2U chassis: basically four independent computers. The really awesome thing about this chassis is that it's a cluster in a box, so we're running four independent machines that each have their own independent local storage. I've outfitted mine so that the M.2 slots have Intel Optane, and then I have four 4 TB Intel P4500 U.2 drives in each node. We've got EPYC Milan in almost all of them; one has a 24-core EPYC Rome, the 7402P, that I bought with my own money, but the other CPUs are ones AMD sent me to do experiments with.

This setup potentially doesn't require a SAN. All the storage is local, and that's what you need if you want to absolutely maximize the IOPS available to your virtual machines. We're at a kind of crossroads for hybrid design and the other decisions that go into building virtual machine infrastructure, and this knowledge doesn't apply only to XCP-ng; it also applies to VMware and, to a lesser extent, Hyper-V, since Microsoft is trying to do its own thing with Storage Spaces. But I digress.

With those solutions, we're talking about I/O over the network: you have a separate storage system. We did a review of the Synology SAN; not their NAS, but their SAN product, which is pretty locked down. You don't run custom software on it the way you do on the NAS. It's a SAN platform with 12 bays by default (you can add more), and there are two Xeon D nodes for redundancy, so two physical computers provide redundant interfaces to the disks and connect them to the network. You can then have one, four, eight, ten, or any number of machines connect to the SAN over the network. Historically, SAN technologies have included Fibre Channel for connecting servers to disks, and in more modern times even a direct PCIe fabric if you really need the speed.

What's happening is that the demand for more and more performance is putting pressure on vendors to deliver better and better SAN performance, and vendors like VMware are looking at things like vSAN, which in the best case has the potential to deliver local numbers of IOPS. With these SSDs we're in the hundreds of thousands to millions of IOPS range; with a network SAN we're talking tens of thousands to maybe close to a hundred thousand IOPS, and that's just because of the overhead of being on a network.

But sometimes raw I/O performance is not what matters most to people; it's resiliency and redundancy. If the physical server hosting the virtual machines goes down, it's important that all the stored data be immediately available and accessible. You can get that with a vSAN as well, because as writes happen on the local machine, even at 100,000 or 500,000 IOPS, those writes are replicated over the network, at least until some buffer or threshold is hit. For example, if you want your virtual machine, which is receiving bank transactions or something, to never be more than 30 seconds behind a highly available replica, it can run at 500,000 IOPS until that 30-second window has passed. If the network connection to storage is the limiting factor, say it's limited to about 100,000 I/O operations per second, the VM drops down to 100,000 I/O operations per second until the replica catches up, because you never want to be more than 30 seconds behind.
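To make that throttling behavior concrete, here is a minimal Python sketch of lag-capped write-behind replication. It is a toy model of the idea described above, not any vendor's actual algorithm; the function name, rates, and lag cap are all illustrative.

```python
def simulate_writes(issue_times, replica_iops=100_000, max_lag_seconds=30.0):
    """Toy model of lag-capped write-behind replication.

    issue_times: sorted timestamps (seconds) at which the guest issues writes.
    Returns the acknowledgement time of each write: immediate (local speed)
    while the replica is within max_lag_seconds, then delayed to the
    replica's pace once the lag window is exhausted.
    """
    per_op = 1.0 / replica_iops   # replica service time per operation
    replica_free_at = 0.0         # when the replica can start its next op
    acks = []
    for t in issue_times:
        # The replica will have this write committed at:
        replica_done = max(replica_free_at, t) + per_op
        # Acknowledge immediately, unless doing so would leave acknowledged
        # data more than max_lag_seconds ahead of the replica.
        acks.append(max(t, replica_done - max_lag_seconds))
        replica_free_at = replica_done
    return acks

# Scaled-down demo: a guest bursting at 500 writes/s against a replica that
# absorbs 100 writes/s, with a 3-second lag cap. Acks run at full local
# speed until the replica is 3 seconds behind, then settle to 100/s.
burst = [i / 500 for i in range(3000)]
acks = simulate_writes(burst, replica_iops=100, max_lag_seconds=3.0)
```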
In a Fibre Channel situation, or if you've got a PCIe fabric for your network storage, as soon as the storage medium says "yep, we got that," the software trusts it and runs with it. In the case of the Synology SAN, when the acknowledgment comes back saying the data is committed, it is physically on one set of disks, but two different servers can access that one set of disks, so if one server goes down, the other can still reach it. In the SAN, the two controller nodes provide network access to the physical disks, and each physical disk has multiple paths, but overall you're limited by the network interface. It's slower than physically attached disks; maybe not for mechanical disks, but certainly for NVMe. Then there's the whole NVMe-over-Fabrics thing, but this video is about XCP-ng, not those things. The reason I'm giving you this background is that XCP-ng is not very opinionated about exactly how you do storage, which is one of the really cool things about it.

Let me back up for a second. Philosophically, XCP-ng is more like VMware than Hyper-V or Proxmox. In a nutshell, if you've heard of Proxmox: as a virtualization platform, it's basically a Linux distribution, and not a particularly lightweight one. You can log in with SSH, run commands, and get access to basically everything Linux offers. It uses KVM for virtualization, and there's a nice web GUI that the Proxmox people have put together; it works really well. You can use ZFS for storage, or you can go off script with the underlying distribution, which is Debian, and come up with something that fits your needs. If you look at something like VMware ESXi, by contrast, your ESXi host is fairly rigid: you install ESXi, and then you install software in virtual machines which manage one or more ESXi hosts. You wouldn't, for example, install packages to do more from the command line on your ESXi hosts. XCP-ng is somewhere in the middle, but really closer to the VMware side than the Proxmox side: it's a really lightweight Linux micro-distro, and then you set up virtual machines for management.

That's where the licensing comes in. If you go to the XCP-ng website and follow the directions, you'll get a fairly lightweight free edition of the management virtual machine for controlling one or more of your XCP-ng hosts; because we're working with that really awesome Tyan machine, I've got four hosts to manage. If you want, there are alternative scripts that people in the community have put together that unlock all of the features; that's what I did. You can build that management virtual machine yourself from source, basically. It's not supported, and you don't get support (buying support is actually worth something), but then you've got the whole enchilada of everything you can manage.

When you set up XCP-ng, the installer is quite good; you can even set it up as a RAID 1. What I did was install it on the Optane drive, then drop to a command prompt and use the Linux mdadm tools to create an md array from my Intel P4500 SSDs. So you can store your virtual machines on a different array than the one XCP-ng itself is installed on, and that works fine.
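For reference, here is a hedged sketch of those dom0 steps, wrapped in Python's subprocess module for readability. The device names, the RAID level, and the SR label are assumptions (the video doesn't specify them), and the host-UUID lookup assumes a single-host pool; mdadm and the xe "ext" SR type are standard on XCP-ng, but check your own device layout before running anything like this.

```python
import subprocess

def run(cmd):
    """Run a dom0 shell command, echoing it first."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Assumed device names for the four U.2 drives; check `lsblk` on your host.
DEVICES = ["/dev/nvme1n1", "/dev/nvme2n1", "/dev/nvme3n1", "/dev/nvme4n1"]

# 1. Build the md array (RAID 10 is an illustrative choice here).
run(["mdadm", "--create", "/dev/md0", "--level=10",
     "--raid-devices=4", *DEVICES])

# 2. Look up this host's UUID (assumes one host; on a pool, pick yours).
host_uuid = subprocess.check_output(
    ["xe", "host-list", "--minimal"], text=True).strip()

# 3. Register the array as a local EXT storage repository for VM disks.
run(["xe", "sr-create", f"host-uuid={host_uuid}",
     "type=ext", "content-type=user", "shared=false",
     "name-label=Local md SR", "device-config:device=/dev/md0"])
```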
You can also install other plugins. If you would rather do this with ZFS instead of Linux md, you can; there's no installer GUI for it, but it works. XCP-ng does not support ZFS on root, but you can create a ZFS-based storage pool for your virtual machines. It is a little weird, because if you're using ZFS for the virtual machine snapshots, the GUI doesn't quite translate exactly, so you really end up relying on XCP-ng's own snapshot mechanism for that. Again, it's a philosophical thing.
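A similar sketch for the ZFS route, assuming the ZFS packages are installed in dom0 and the "zfs" SR type is available (XCP-ng 8.x ships one); the pool name, mirror layout, and dataset are illustrative.

```python
import subprocess

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Assumed layout: two mirrored pairs striped together (RAID 10 equivalent).
run(["zpool", "create", "-o", "ashift=12", "tank",
     "mirror", "/dev/nvme1n1", "/dev/nvme2n1",
     "mirror", "/dev/nvme3n1", "/dev/nvme4n1"])
run(["zfs", "create", "tank/vms"])

host_uuid = subprocess.check_output(
    ["xe", "host-list", "--minimal"], text=True).strip()

# Point the SR at the dataset's mountpoint (/tank/vms by default).
run(["xe", "sr-create", f"host-uuid={host_uuid}",
     "type=zfs", "content-type=user", "name-label=ZFS VM SR",
     "device-config:location=/tank/vms"])
```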
There are also solutions that will let you build a virtual SAN, and a virtual SAN in the XCP-ng context means replicating data so that you can lose an entire node. Instead of planning for a disk failure, you plan for a node failure, and you treat any failure within a node as a failure of the entire node. If a disk inside a node fails, the node doesn't keep limping along; that node has basically failed, and its workloads start running somewhere else. The functionality means that any data still accessible on that node can still be accessed by the cluster at large, in the sense that the cluster can shut down a virtual machine on the failing node and bring one up somewhere else.

Now think about all the scenarios I described: we're talking about SANs, I need more IOPS, I want things to go as fast as possible. Can I get a million IOPS over the network on 10 or 25 gigabit Ethernet? Probably not. But then you think about this failure scenario, and it's really still as good as the worst case for local replicated storage. Let me say that another way. I've got my virtual machine running on my XCP-ng node using local storage, but the data is also replicated elsewhere in the cluster. Something happens and I lose that information on that node. Depending on the policies I've set, the replica of the data could be up to the minute, 30 seconds behind, a minute or five minutes behind, or "last backup" behind. It's not really configured that way in the GUI; as an administrator, you think about it in how you set up your nodes. But say this machine fails and I've lost access to local storage: the virtual machine can actually keep running on this node, assuming the node is otherwise completely functional, because it can read and write through a replica that exists elsewhere in the cluster.
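As a toy illustration of that node-failure policy, here is a short Python sketch: any disk failure marks the whole node as failed, and its VMs are simply brought up on healthy peers that hold replicas of their storage. The class and function names are hypothetical; in practice this orchestration is handled by XCP-ng and Xen Orchestra, not code like this.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    healthy: bool = True
    vms: list = field(default_factory=list)

def drain_failed_node(failed, cluster):
    """Evacuate a node under the 'any disk failure is a node failure' policy.

    Rather than rebuilding an array in place, the whole node is marked
    failed and its VMs are restarted on healthy peers, which serve the
    VMs' reads and writes from replicas of the failed node's storage.
    """
    failed.healthy = False
    survivors = [n for n in cluster if n.healthy]
    if not survivors:
        raise RuntimeError("no healthy nodes left to evacuate to")
    for i, vm in enumerate(failed.vms):
        target = survivors[i % len(survivors)]  # naive round-robin placement
        target.vms.append(vm)
        print(f"restarting {vm} on {target.name}")
    failed.vms.clear()

# Example: node2 loses a disk in the four-node cluster.
cluster = [Node("node1", vms=["web1"]), Node("node2", vms=["db1", "db2"]),
           Node("node3"), Node("node4")]
drain_failed_node(cluster[1], cluster)
```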
We have four nodes in our Tyan server chassis, which is special and awesome because it's four nodes in 2U of physical space, but it works just the same as if I had four rack-mount servers. In an ideal world, a virtual machine that is getting all of its I/O from a remote replica would be live-migrated over the network to another node that has its storage local, or at least mostly local, until you can re-establish the replica or fix whatever went wrong with that node. That's the sort of thing I look at when I'm evaluating systems like these: how well does that work? And XCP-ng works really, astonishingly well.

It's very impressive that I can pick what I want for storage. If I want a vSAN-type solution, there are at least two different software plugins that will give me that. If I want ZFS and ZFS replication, which is not real-time, not the way I described above, I can DIY that as an administrator, and it works really well. That gives me all the IOPS of local storage, with the caveat that it might not be perfectly real-time; I might be running a few minutes behind, hours behind, or "last night" behind, depending on the replication strategy I've implemented. If you don't want to think about it, the company behind XCP-ng offers paid options to help you configure real-time replication. Yes, you can use something like Ceph; yes, you can use something like DRBD (distributed replicated block devices); and the caveats about your network connection and your IOPS still apply to all of it. But it's a delightful Linux underbelly that's a bit more accessible and a bit more flexible than what you get with ESXi.

Of course, VMware with ESXi has been around longer and has been there and done that a lot more. Is it a somewhat more mature solution? I think so, but it also costs a lot more. A lot of people with older VMware clusters are moving them to XCP-ng strictly for cost reasons, because they don't need anything super complicated. Microsoft is also eroding that a little with Hyper-V, but I've done these tests with Hyper-V and Storage Spaces, and I struggle to get more than 50,000 or 60,000 IOPS on a Hyper-V and Storage Spaces system like this. Storage Spaces works great for something like a giant document share across your company; in my opinion it doesn't work nearly as well for virtual machines, though I could be doing something wrong, and all of that is about to change with Server 2022, so it's something I have to revisit with the new versions of Windows Server that are coming out.

XCP-ng should be on your radar, and I can walk you through the installation for the reasons I've explained, but you should also check out Lawrence Systems, a fellow YouTuber and confidant, who has done a lot of videos on XCP-ng. In fact, he just did a video on the performance of NFS versus iSCSI for storage, in terms of the mechanism your nodes use to connect to your SAN. Now, I'm of the opinion that if you're using a SAN like that, it should be going away in favor of technologies like vSAN. A SAN still has its place in current infrastructure; it's not that you shouldn't have one. It's just that for your fastest, highest-performing virtual machines that need a lot of I/O, it makes a lot of sense to run that I/O locally and then decide how much of a replica you need elsewhere, because nothing beats local I/O.

I was really impressed with how robust XCP-ng is and how easy the setup was. Just be aware that if you follow the official documentation for installation, you get the community version, which doesn't have quite all the features of the unlocked build-your-own version, and you should definitely do the unlocked build-your-own version, because it's nice. I'll have some other content coming up for XCP-ng; I just wanted to do a quick chat about it. I'd also like to do more of this kind of content for the Linux channel, so if you have time or experiences you can share, I would love to draw on that for future videos. It's a lot of work to do these kinds of videos, to really dive in and analyze how many IOPS I'm going to get, what the performance is, and how it actually works under the hood, so I'd love to connect with you and pick your brain if this is something you have a lot of experience with. I'm Wendell, this is Level One, and this has been a quick look at XCP-ng, the first of several videos I've got planned. You can find me in the Level One forums.
Info
Channel: Level1Linux
Views: 46,595
Keywords: technology, science, design, ux, computers, linux, software, programming, level1, l1, level one, l1Linux, Level1Linux
Id: XLQp_jI5vNs
Length: 16min 9sec (969 seconds)
Published: Tue Sep 07 2021