Proxmox Backup Server Tour and My Experiences with PBS

Captions
today i'm going to take a look at pbs or proxmox backup server so pbs works as a full backup solution for your proxmox virtualization systems and compared to the old solution that they had which was just to create a little image in proxmox it's a lot better of a backup solution it can do incrementals it can easily remove old backups it can keep only the backups you want it can also back up arbitrary files on a debian based system and it's overall a pretty good solution in my opinion but i'm going to dive a little bit deeper into the interface what i like about it what i think could be improved about it and just some other thoughts i have about pbs so what is pbs for so proxmox really hasn't had a great backup solution until pbs came out originally it had a basic backup solution which if i take a look at it on a proxmox system here i can run backups and it essentially would just capture a full image of that system and it'll compress it with the config file and this can still be nice if you want to migrate between systems but it had a few downsides first of all it was only full backups so if you wanted to back up let's say a 100 gig vm every day for a week that's 700 gigs of backups whereas with incrementals it might only be like 150 or 200 gigs of backups because it only stores the changes on the other days and pbs does have incremental backups like that so it's much more efficient with storage space the other thing was because it was a full backup it was slow it has to read all the data and then write all the compressed data a lot of other backup solutions like veeam and pbs only look at the changed blocks so they look at it and say hey this is a map of changed blocks so for a vm of 100 gigs maybe only one or two gigs have changed so it only needs to back up those one or two gigs because it can assume everything else is perfectly the same because it keeps track of what's been changing on disk and because proxmox is a relatively small player in the field veeam and other commercial
solutions haven't really made a proxmox version so you really didn't have another great option until pbs came out to do good backups and because of that i thought it was one kind of downside of proxmox there wasn't really a great solution and luckily they've built one now it works really well with proxmox it's pretty much plug and play and it uses a lot of the same interface and design that you're used to if you're using the standard proxmox virtualization interface now you don't have to use proxmox ve with it but it's a little bit odd because proxmox backup server is designed to back up your containers and your vms and then you can back up arbitrary data on a debian based system with a client and this can be nice for something like a nas but really if you're doing heavy nas use or something else you might want to look at other backup solutions than pbs but if you already have a pbs system it might be nice to just add something else and i'm glad they added that in on my list of wants i'd love to see like a windows client and maybe like a centos one too it'd be cool if you could kind of use it to back up everything i'd be super happy if they made it for other hypervisors too but i think just having more support in the client would be great with that out of the way let's take a look at the interface of pbs so on my screen right now i have the dashboard of a pbs system so i see a few things first of all i just see system info like what cpu i have what version it's running my usage on this system because it's not doing any backups right now my utilization is very low on here it's not doing anything i see the longest tasks which i can click to open and view the task here and it shows the log of everything that's been done and it shows me if there's been an error on tasks a summary of all the failed tasks so i can look at a failed task and see what happened one thing i wish it would do was highlight better what the error was some of these aren't too bad but some of
these like i believe this one here that had a warning you get this really long list and it doesn't highlight it i think it would be cool if it highlighted the row here in like yellow or red or whatever color goes with it to say like this is the warning one here or maybe just when you open it here it shows the warning on this menu as well i think just a way to better see the warnings would be kind of neat a lot of these errors you see here are from my testing because i kind of pushed the software to its limits during my testing you should see fewer errors than this if you're running it in a production environment it's been pretty solid in my use overall next thing data store usage so this is your list of data stores that you have how big they are how much data you've used how much available space percentage and estimated full so what it'll do is if you make new backups it'll estimate when it'll fill all the way up so then you know how much time you have remaining which is cool because then you can know how to set it up so that it never fills up and i like the little graph they have of usage too it's kind of nice to see i filled it all the way up and then it's kind of sat at about the same since then running tasks if you have any it's currently idle so i don't have anything and then no subscription so i can pay for a subscription and get paid support and the full repository for it but i have currently not done that on this system right here and then i can look a little bit more at the configuration through some of these tabs so this is going to be very familiar if you've used ve very much it's a lot of the same style user interface a lot of the same style user management and then you get remotes so remotes you can set up are essentially a second system a second pbs system it will sync to so to sync all the data that you've backed up on the one system to another system and that's great for having something like an off-site backup and in
my use i have a co-located server that's connected via vpn that i have it so it's syncing all my backups over that connection and it's smart so it only syncs the changes you've made and doesn't sync everything so that's one really cool thing i'm going to get a little bit more into the remote setup in a bit traffic control lets you limit the speed of remotes and the max speed that they can transfer at certificates and subscriptions if you have any administration lets you access a shell of the system so essentially just a local command prompt and then storage and disks so this is for the actual physical disks on the system so this system here is using zfs to manage the raid and volume management and everything so you see here i have a lot of these zfs drives it'll tell you if they've passed smart what their serial numbers are on all my drives and i can see i have a zfs pool here everything's online and it's set up here i can also use this to create zfs pools in the menu here and zfs seems to be the suggested way of managing multiple drives otherwise you can do whatever you want on linux to set up drives so i can add an arbitrary directory in linux and make it a storage for pbs so if i wanted to use like mdadm or another file system i could do that here and then tape backup so i don't have a tape drive to play with so i can't really comment on it but it does say it supports tapes and will put your backups on tapes if you want it to data stores lets you manage your data stores and which data stores are being used so i get a summary here if i look at data stores and if i click on the data store this is where you get i'd say a lot of the options i want to look at and poke at so now i'm going to take a look at the main data store in the system i'm storing most of my data on so on the summary page i can just see a summary of the data store so that's great i get a nice little summary of the utilization graphs of like iops transfer rate utilization i get to see how much is stored on
it how much usage is but i think it really starts to get cool when you look at content so i've backed up everything on here so i can look at like vm 100 and that's where i see all of my different vm snapshots from it looks like i had it every two hours for this vm it's up to 32 gigs and then i can see the data in it so it has a log file it has the actual image the index and then the qemu-server.conf which is the configuration file for that vm so it has each one of those in every one of these and then this is deduplicated so it doesn't actually use the full space and then at the top i get an option to verify all or i can do it here for verify i can change owner for permissions and then i can prune it so pruning basically lets me take a vm which has a lot of snapshots of different times and only keep some of them so i don't have a great example here but i can say keep hourly of the last like four hours so it'll keep me the last four hourly backups i don't have enough examples to look at other ones but it's super cool that they actually tell you which ones it keeps and which ones it doesn't but right now i'm going to only keep the last one to show how it works so now i have it pruning so it's going to delete all the chunks from the ones that aren't needed i'm going to talk a little bit more on how chunks work in a little bit and actually that task i believe was super fast but it won't actually delete it from disk until you run garbage collection which actually deletes the files from the disk so let's take a look at that now one other thing to note before i look at that is i can see i have a few icons so this little cube means a container the little kind of pc desktop thing means vm and the little tower means a host so that's that arbitrary data backup so i was just using it to back up my main host in this name brand and server 5 and i had a directory so i had this pxar directory where it puts everything together and in this case it's 4.3 terabytes so
garbage collection is where it goes and actually marks which chunks aren't needed and removes them so now that i've talked a little bit about chunks let's take a look at how proxmox backup server stores the data on disk so a lot of backup software like veeam and others will make a full backup the first time you run a backup and every once in a while and then it'll make incremental backups so it'll take a look at that full backup and the current state of the vm and make a little file of all the changes and that works pretty well depending on your use case but it has a few disadvantages which is if you want to keep one arbitrary vm state you have to keep all the other versions in that series it can also make deleting old ones kind of a pain so if you want to keep one from like a year ago you might have to keep a whole chain of the last month before that year ago one if you want that arbitrary backup so the chunk storage works a little bit differently instead of having a big image or something like that it splits it into little chunks so in this example i made right here my first vm state needs 10 different chunks my vm state 2 needs 10 chunks too but two of those chunks are different and it doesn't need two of the original ones and then on disk it's storing 12 chunks because it has the original 10 and then the two more that are needed and what garbage collection and pruning can do is it tells it hey if you want to prune it maybe we don't need state two right now so then we can delete the extra two on disk so it takes a little bit of time it looks at which ones it needs and then it marks some for deletion and then it actually goes and deletes them these chunks are typically up to four megs in size but on average it seems to be about 1.4 to 1.7 megs in my use case but it really depends on what exactly your vms are and how they're being stored so if i take a look at the screen now it's marking which chunks it doesn't need and then it's looking at the unused ones and will
start to delete those unused chunks to actually save you the disk space now the big disadvantage of this compared to the more traditional full backup and incremental is this can take quite a while i've done some timing depending on the size of the vm but it can take multiple hours on a slower system like i have running pbs and with large backups it's going to take quite a while so if you're doing tens or hundreds of terabytes of backups you're going to probably want to do some estimation of how long it's going to take for you to do it for just a couple terabytes like i have here it's reasonably fast but you just want to make sure that this is a reasonable solution for you and the amount of time it takes to process this the great thing is this is quite a bit more efficient disk space wise compared to the full and incremental because it only stores what it needs so if you only need the last like one month and then a couple from the last week and then everything from the last hour it only stores that and here taking a look it actually shows my summary so it says it removed 42 gigs of garbage which was likely that vm i just pruned it removed 32,000 chunks it was originally using 33 terabytes so this 33 terabytes is the size of the data before deduplication but because of a deduplication factor of 5.34 it only needs 6.2 tebibytes of data and the average chunk size in this case is actually 3.3 megabytes so it's a maximum of four megabytes i believe and i've just saved a little bit of disk space the next tab here is sync jobs so this is where if you have a remote system you can sync it to the remote system so let's take a look at one of my other systems that has a sync job going so this is another copy of pbs running right now and this copy of pbs is running on my co-located server somewhere else connected via vpn i've added it remotely under remotes here so i can see that this has a remote host and then i've added under the data stores a sync job with this system so i
have it connected here and i said every hour i want you to copy it as fast as you can and because i don't have the fastest connection with this system i can see that it's got some relatively slow jobs and it'll just copy all the snapshots so it'll copy only the changes that it needs to copy so it's relatively efficient when it comes to time and changes and it essentially will compare the list of chunks that it has on the remote system with the local chunks and just make sure that it has everything it needs remotely so that's pretty nice that it only copies what it needs and it's fairly efficient and the big thing i get from pbs's design is that it's generally pretty darn efficient when it comes to the amount of data it needs to copy and the amount of file space it uses on disk which i like being someone who doesn't want to have a ton of extra disk space the next option is verify jobs so a verify job will just do a read and checksum so this does make sure you don't have any silent file corruption or any issues with it after it's been copied or synced or anything so in this one i can click the little v button it'll run a verify job and this is something you probably want to run automatically and they have an easy way to set it up to run it maybe every week just to make sure there's no disk issues or anything here so in this case i had a small container and it looks like everything's doing okay and i can set up a verify job easily to say hey maybe every saturday in the evening and i'm going to re-verify after 30 days so that way if it sees it's been verified already recently it's going to assume it's fine and then i have options which lets me notify users in case something happens and then verify if new snapshots are created and then i have a permissions tab which lets me set up permissions if i had multiple users and this works just like proxmox where you can have groups and assign permissions to groups and say different users can access
different data stores so let's talk about one of the most important parts about backup which is just restoring it so you can access the data that you've backed up before so there's a few different ways to do that some of which are in the proxmox ve interface and some of which are in the pbs interface so let's take a look at both right now and i'm going to be using this vm 109 which is just kind of my test windows 11 system right now for doing that so if i look at content i can see i have vm 109 and i have a few different snapshots from pretty recently that i could look at so if i want i can actually just take it and i can download this image file here and it's a compressed image and i can just save it it's actually going to download it to me uncompressed and it's going to take a little bit of time to download but now i have an image file that if i want i can put into another hypervisor i can do whatever i want with i could extract here i'm not going to do that now because it's going to take a while to download but it's just an image file of the virtual disk that it has i can also see the config file here so if i want to download this config file i can then view my conf file here and if i want i can open it in something like notepad and it's just a config file that proxmox uses to say hey this is what hardware i gave it so if i want to restore it onto another system these are probably the files i want but if i want to restore it on the same proxmox host or a different host i can do a few things and the two ways to do it are first of all going into the actual vm and then go to backup here and then i can see my backups if you don't see anything make sure the correct storage is listed because otherwise it won't see anything so i can say hey maybe i want the one from 17:04 so i can click restore it will restore it over the same vm so it'll overwrite what you have and then i get a few options of if i want to start it right afterwards or if i want to do a
live restore and the live restore means instead of waiting for the restore to finish it will immediately start the vm using the data in the backup so i'm going to actually try doing that right now so it'll actually erase anything the vm has after that state and overwrite it with what your backup is so you might want to be careful of doing it this way also it's running right now so i'm going to stop my vm first to make sure that i can actually restore the backup and once my vm has stopped i can now run this restore job here restore it and i'm gonna do that live restore right now and it's going to overwrite my current vm with the content and immediately start running it from the actual backup data and then once it finishes the restore it'll start running it from the new one so i can see it's starting to restore the disk right now and it should have the vm running very soon and actually yeah the vm is running right now so even though the backup restore just started a few seconds ago i actually have a usable vm that's booted up right now or is starting to boot at this current moment the other thing i can do is restore the backup into a different vm that's started from that state and i'll keep the original vm if i want to have both copies so the way i can do this is by going into the data storage i have here clicking on the vm i want to restore and the point in time so in this case let's use a different one a slightly older copy and i'm going to click restore and i'm going to restore it as a new vm in this case 114 and it's just going to be on to whatever the storage it was using before is i can do the same live restore unique will regenerate things like the mac address and other unique variables for that vm and maybe i'll have this one start right after the restore finishes too so now i've restored the backup into a new vm so i haven't overwritten anything so this can be nice if you have a vm that's gone into a weird state you don't want but you want to be
able to still restore stuff so maybe it had ransomware or something and you're still trying to recover from the ransomware because you want to make sure you get anything you can but otherwise you still have your latest state so it's nice that they have both options here and it's just going to take a little bit to restore this vm and get it running and the other great thing is i can mount this pbs share onto another system view these same vms i have here and restore them onto my other system which can kind of be a nice way to do an offline migration or something like that but if migration's your goal i'd do more of the traditional backup onto like maybe an external drive or a network share move the drive over and then restore from there pbs is more for kind of an ongoing backup and less of kind of a move between systems type of backup so now that i've talked a bit about the user interface and some of the features that pbs has let's talk about how i use pbs and what i back up so i have a proxmox virtual environment system running with quite a few of my different vms i use for different personal home lab tasks i have some vms that i see as more important than others things like a lot of my personal data that i sync from my documents and downloads folders and some are just the things where i want to keep the data more those things are backed up to a local pbs system running on my actual virtualization host because i wanted to try it as one of the options is to run it on the virtualization host or any other debian system i just installed the packages and it works fine and i back it up on there and i back it up to another little zfs array on my main virtualization server the nice thing of this is i can have it share the same array really easily this instance running on my main system has been synced with my co-located instance so i just have a remote job running they're connected via vpn and it just syncs the data every hour so that my co-located always has the most
recent copies of my very important vms synced to it and this thing's been pretty reliable pretty solid haven't really had to worry about anything and it just kind of works i'm quite happy with how pbs works in this instance i then have another system that backs up all my vms and all of my data to it locally this system i leave off a lot of the time just to save power consumption and this system is that emc isilon i've taken a look at earlier it's a pretty solid older dual 1366 server running dual l5630s 24 gigs of ram and is more than plenty for running pbs here it has a total of about 24 terabytes usable of storage and i use that to back up all my vms and then i also use the data store backup in pbs to back up some of my nas stores so i have a video project store that videos like these end up on i have my personal photo projects and other video projects and then i also have just a backup of all kind of my storage data that i have for other uses that backs up to that system there's a bit more manual setup as i like to shut it down to save power when possible but other than that it's pretty darn solid and backs up quite quickly using the incremental backups this is my relatively simple pbs system that i use that works quite well for my little home lab use case now that i've gone over how i use it let's go over some of my kind of wants feature requests and things like that that i have for pbs and some of the issues i've run into i really can't replicate any major issues or crashes it's been a pretty solid product for me and ran very reliably in my use case a few one-off issues but when i tried to replicate the issues i couldn't really replicate them so i'm not going to comment about them now as i'm just assuming it's something weird happening and they were pretty rare overall in my use case of pbs 2.1 now on the things i want some of those are a progress bar for sync jobs it just kind of says it's syncing and maybe how many chunks i'd love to see like a
percentage and like maybe a little progress bar or an estimated time remaining i believe it does a global lock when you do the vm backups instead of a per vm lock i am not fully sure of how it's doing it but it'd be cool if it could do a per vm lock instead of a global lock so then that way you can have multiple backup jobs running on one virtualization host at the same time which is great if you have a lot of large vms you want to back up at the same time another thing i'd like to see improved is how it does multiple incremental backups so one thing i've noticed is it seems like you have to have one job run multiple times for it to actually have the incrementals working and it has to be a running vm if the vm is shut down it always runs a full backup and reads all the data i'd love for them to fix that but you can kind of get around that by just putting those in a different job that runs less often as you likely don't need to back up shut down vms as often but it's just a little messy and something like veeam can do this and the other thing is if you have a backup job running then run a different backup job to a different storage device and then run that original backup job again it has to recreate the change tracking so it has to do a full backup and can't just do an incremental so if there's some way to make it just keep separate incremental data for multiple backup jobs that would be cool and sometimes what i've done is have like a backup job to one location and another but then again pbs has a solution in syncs so then you don't really need as many backup jobs there so for a lot of vms if you just only have one backup job that's not really an issue but if you want to back it up to multiple locations that is something i'd like to see them improve i'd like a better way of showing job errors as i showed earlier i'd like it to just maybe highlight the line that has the error or something and just make it pop out a little bit more and the other thing i'd like to see is
that the proxmox backup client on a system has some sort of way of showing progress either on the web interface here or on the terminal command line just show like a percentage done something like maybe how borg works where it can show you that with the progress or stats option for the backup client another thing i'd like to see which is the way that borg works is to just ignore files that it's already backed up it appears to do a full read of all the files and compares them no matter what but if it could compare like the date time and say hey the file system says it's unchanged just trust that it is and don't back up that file and then of course a faster garbage collection would be great but i believe that's pretty system limited as it seems to be limited by disk io and cpu here although it probably could be slightly faster overall i think it's a pretty good solution for backups unfortunately there really isn't anything to compare it with because other commercial solutions like veeam and stuff don't work on proxmox but i think it's pretty feature complete compared to what veeam would do on another hypervisor and there's nothing major i'd say is missing i think it definitely checks the box for me for a good backup solution that stores it efficiently it's easy to manage and keep what data you need there's no major omissions feature wise that i think i'm missing or would complain about so i'm pretty darn happy with what it's given me and i think it continues to make proxmox better and i'm happy that they keep putting work into developing it and continuing to make it a great hypervisor and i'd say it's pretty on par in a lot of ways with the other large commercial options so i'm quite happy using it and let me know if you're using proxmox backup server what your thoughts are and if you want to keep using it or if you have any other complaints that you'd like to see changed or improved
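Illustration for the chunk storage part of the captions: the 10-chunk and 12-chunk example, and the point that pruning is fast while garbage collection is what actually frees disk space, can be sketched in a few lines of Python. This is a toy model and not PBS code. The ChunkStore class, its method names, the fixed blocks and the use of SHA-256 as the chunk id are all assumptions made for this sketch, and the real garbage collection is more involved than a simple mark and sweep over an in-memory dictionary.

```python
import hashlib


class ChunkStore:
    """Toy content-addressed chunk store (hypothetical, for illustration only)."""

    def __init__(self):
        self.chunks = {}     # digest -> chunk bytes, stands in for files on disk
        self.snapshots = {}  # snapshot name -> ordered list of digests (the index)

    def backup(self, name, blocks):
        index = []
        for block in blocks:
            digest = hashlib.sha256(block).hexdigest()
            # A chunk that is already stored is not written again (deduplication).
            self.chunks.setdefault(digest, block)
            index.append(digest)
        self.snapshots[name] = index

    def prune(self, name):
        # Pruning only drops the snapshot index; chunks stay until garbage collection.
        del self.snapshots[name]

    def garbage_collect(self):
        # Mark: collect every chunk still referenced by a remaining snapshot.
        in_use = {d for index in self.snapshots.values() for d in index}
        # Sweep: delete chunks that no snapshot references any more.
        unused = [d for d in self.chunks if d not in in_use]
        for digest in unused:
            del self.chunks[digest]
        return len(unused)

    def restore(self, name):
        return [self.chunks[d] for d in self.snapshots[name]]

    def dedup_factor(self):
        # Chunks referenced by all snapshots divided by chunks actually stored.
        referenced = sum(len(index) for index in self.snapshots.values())
        return referenced / len(self.chunks)


store = ChunkStore()
state1 = [f"block-{i}".encode() for i in range(10)]    # vm state 1: 10 chunks
state2 = state1[:8] + [b"changed-8", b"changed-9"]     # vm state 2: 2 chunks differ
store.backup("vm-state-1", state1)
store.backup("vm-state-2", state2)

chunks_after_backups = len(store.chunks)
factor_before_prune = store.dedup_factor()

store.prune("vm-state-2")
chunks_after_prune = len(store.chunks)

removed = store.garbage_collect()
chunks_after_gc = len(store.chunks)

print(chunks_after_backups, chunks_after_prune, removed, chunks_after_gc)
# prints: 12 12 2 10
```

Because every snapshot is just a list of chunk ids, any single snapshot can be kept or dropped without keeping a chain of earlier incrementals, which is the difference from the full plus incremental scheme described in the captions. The same ratio idea is behind the summary numbers quoted there: 33 divided by a deduplication factor of 5.34 is roughly 6.2.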
Info
Channel: ElectronicsWizardry
Views: 13,883
Id: r7WDulNGV0E
Length: 25min 45sec (1545 seconds)
Published: Sun Feb 13 2022