Backup in Linux Servers - Docker Volumes, and Databases

Video Statistics and Information

Captions
In this video it's all about backups on Linux servers. I didn't want to just create a simple rsync tutorial, because I wanted to really emphasize the importance of the different backup strategies, especially if you're running a Linux server with some Docker containers, persistent volumes, or even databases that you need to back up. So in this video I will show you a quick and easy method to securely and automatically create a backup of your entire Linux server. Also make sure you're watching this video until the end, because then we also talk about automatic backups for databases, and there it gets a bit more complex, but it is really important. So don't waste any more time, we have a lot of stuff to cover, let's go.

Hey everyone, this is Christian, and in this video let's talk about backups on Linux servers. I want to show you how I manage the backups on my Linux servers, especially when I'm running some Docker volumes or some databases on these servers. But before we start with the actual tutorial, we also need to talk about backups in general and about some backup strategies.

As IT professionals we probably all know about the importance of backups, I hopefully don't need to explain this to you, but I still hear some misconceptions people have, especially when we're talking about redundancy versus backups and copying files away. The first big misconception is that people consider redundancy a valid backup solution. If you don't know about this, let me explain it. If you're running a Linux server with a redundancy option, for example a ZFS file system or a hardware RAID where you're mirroring two hard drives, you always have the same data on both hard drives, and in case one hard drive is damaged or fails, you can easily replace it without losing data, because it's mirrored on the second hard drive. But this is not really a valid backup solution, because you're always copying all the changes. For example, if you accidentally delete files, or you're doing an update, something goes wrong, and you want to revert those changes, you cannot actually do this with a redundancy option. So always keep that in mind: redundancy is not a backup.

The same can also apply to simple synchronization jobs that you run every single day to a separate location. For example, if you just have a simple job that synchronizes your entire file system and overwrites the changes every single day, you're probably able to revert some changes that you notice immediately. But if you accidentally delete some files, you probably won't notice it on the same day, so when you want to go back two weeks and restore a file that you have deleted, and you're always overwriting the backups every single day, you won't be able to restore those files.

What we need is a different solution: a reliable backup solution that also retains the previous states of your file system, and this is why we need an incremental backup. An incremental backup always starts with a full copy of your entire file system to a separate location, but once we do the second or the third backup, we're not just copying all the files again, we're only copying the changes while still retaining the previous states of the backup. That means we can also jump back in time and, for example, restore a file that we backed up two weeks ago and then later deleted. The location where we store the backups should be completely independent from our actual server.
That can be a USB stick, a second hard drive, or better a network-attached storage or even a cloud bucket where we just store and retain those backups. Because if anything goes wrong with the hardware of the server and we need to do a disaster recovery, we need to have all the necessary files in a completely separate location that we can still access when something goes wrong with our actual server.

On Linux there are many different tools that can create such incremental backups, a lot of free and open-source tools as well as commercial products which come with some advanced features. If you want me to do a comparison video about some specific tools, then please put them in the comments, I would really like to do so. But in this video I want to show you a free and open-source software that is called Duplicati. This is a lightweight backup solution that can be easily deployed on Linux servers and even Windows servers, and I've used Portainer to deploy it in a Docker container. By the way, if you don't know about Portainer and Docker, you definitely should check out my other videos about them, because as an IT professional nowadays you need to know this stuff. So let's first start with the deployment of Duplicati on my Portainer server, and then I will show you how you can easily use it to automatically schedule your backups. Let's go.

Duplicati is a free and open-source software which creates encrypted backups and can store them online. It works on Windows, macOS, and of course Linux, and you can just go to the official homepage and download it in the latest version 2.0. Consider that this is still officially tagged as a beta version; however, it's pretty stable and it works very well for me. It supports many different backup targets: you can create backups on local storage or network-attached storage devices, or you can create a backup with strong encryption and upload it to a proprietary cloud provider like Microsoft's OneDrive, Google Drive, an Amazon S3 bucket, and many others. It's based on free and open-source software, and it also comes with a nice web interface which makes it extremely easy to manage your backups.

In the download section you will find an installer for Windows, common Linux distros, macOS, and even a Synology image, but you'll notice that it doesn't have an officially maintained Docker container image. Therefore I looked at my favorite source of cool and awesome Docker images for home servers, and that is linuxserver.io. I've already used many of their Docker images in my previous tutorials; they provide many Docker images that are well supported and maintained by a large community, and they also have a Docker image for Duplicati, which is basically a lightweight container image that has Duplicati automatically installed. In the official documentation you will find a Docker Compose file you can just copy and use for an easy deployment, but I'm going to use Portainer of course to deploy it on my cloud server. By the way, if you want to know more about Portainer, which is a nice free and open-source interface to manage Docker containers and even a full Kubernetes cluster, check out my video about it, I've put a link in the description down below.

As you can see, I'm already running a lot of Docker containers on my server and I want to back them up, but of course you can also configure Duplicati to back up files of your entire server, not just the volumes. To deploy it, simply create a new container and use the image from linuxserver.io. By the way, you don't need to write down all these settings here; as always, you will find the written deployment tutorial on my blog. I always try to make it as easy as possible for you guys, so there you will find all the links, the Compose templates, and any scripts and commands I'm using in this tutorial.
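To give you a rough idea, a minimal Compose sketch for this kind of deployment could look like the following. This is not the exact file from the linuxserver.io documentation or from my blog; the container and volume names, host paths, and time zone are placeholders you would adapt to your own setup, and the PUID/PGID values and mount points simply mirror what I describe in the next paragraphs:

```yaml
version: "2.1"

services:
  duplicati:
    image: linuxserver/duplicati:latest      # lightweight Duplicati image maintained by linuxserver.io
    container_name: duplicati
    environment:
      - PUID=0                               # run as root so Duplicati can read protected files (e.g. SSL certs)
      - PGID=0
      - TZ=Europe/Berlin                     # placeholder time zone, relevant for scheduled backups
    volumes:
      - duplicati-config:/config             # Duplicati's own configuration and job database
      - /var/backups:/backup                 # local target folder for backups on the host
      - npm-data:/source/nginxproxymanager   # example: one Docker volume you want to back up
      - /:/source/hostfiles:ro               # optional: the whole host file system, mounted read-only
    ports:
      - 8200:8200                            # web interface (or put it behind a reverse proxy instead)
    restart: unless-stopped

volumes:
  duplicati-config:
  npm-data:
    external: true                           # assumes this Docker volume already exists on the host
```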
Okay, so when you deploy Duplicati on Portainer, you just need to do the settings according to the Compose file, like setting the three environment variables: TZ for the time zone, which is important of course when you want to schedule your backups, and also PUID and PGID, which configure the permissions the Docker container should have on the system. Here it gets a bit tricky, especially if you need to back up files that require root privileges to access. For example, if you have deployed Nginx Proxy Manager, which manages your SSL certs, you can usually only access them as the root user, for security reasons of course. In the linuxserver.io Compose file the user is set to ID 1000, which is usually the user and group ID of the first Linux user you manually create during the installation. Therefore I needed to change the PUID and the PGID to the number zero, which stands for the root user.

Now we need to connect our Docker volumes. The first volume is mounted to /config; this is the configuration volume for Duplicati itself, where all the config, the backup jobs, etc. are stored. Then you can create a persistent volume or path on the host system as well, where you want to store local backups. For example, I've created the mount point /backup inside the container and I'm storing this in the /var/backups location on the host. Then you just need to mount all the Docker volumes you want to back up; I recommend mounting them under a /source location, and then simply redo this for all the volumes you need to back up files from. As you can imagine, this is very individual based on your setup and requirements, so this may look completely different for your server depending on the volumes and applications you want to back up. If you need to back up the files of your host OS, you can also simply create a new volume mapping for the root directory and mount it somewhere inside the container, so that Duplicati also has access to the entire file system of your server.

Now you could simply expose port 8200 as described in the Compose file, but I'm using Nginx Proxy Manager, which is an awesome solution to terminate a trusted SSL cert on a secured reverse proxy, and I use that to encrypt and protect the web interface of Duplicati. As you can imagine, I have already created a tutorial on that, so if you want to know how to set up Nginx Proxy Manager on Portainer and you want to learn more about this, just check out my video on the channel.

In the web interface of Duplicati we can manage our backups, and you can see I already created two backup jobs, but let me also walk you through step by step and add a new backup job. In the first settings you can set up a name and also set up a strong AES encryption. I would recommend using a secure passphrase, especially if you're storing these backups somewhere on a public cloud provider or in a location that could be accessed by somebody else. On the next screen you need to select the storage type. This can be a local folder or drive, standard protocols if you want to store the backup on a network-attached storage using an FTP server, SSH access, or WebDAV, and you can also store the backups with proprietary cloud providers. For this example I just want to select local folder or drive.
Usually, though, I would recommend storing your backups somewhere else and not on the same server where you are creating the backups. Even though on cloud servers the risk of losing data is potentially lower, in case anything goes wrong with your server you want to have your backup in a location that is completely independent from this machine.

Then you need to select your source files, and when you expand the Computer tab you see a familiar Linux file system. But be careful: this is of course not your host operating system's file system, it is the file system inside the Docker container, which is not persistent except for all the mount points you have mapped your volumes to. So you will find the /backup folder, which is mounted to /var/backups on the host operating system, if you remember, and we also have our /source folder, where we have access to all the Docker volumes we have mounted in this location. Let's select some of the static files inside the Nginx Proxy Manager volumes. As you can see, I'm not backing up the databases. You could technically do this, but there is something else we need to take care of and consider, because backing up databases on Linux servers is a bit more complicated. Don't worry, we will cover it in a few minutes.

Of course you can also apply filters to include or exclude specific files, and schedule the backup in a specific rotation. These are all incremental backups that are compressed with data deduplication, so they depend on each other, and you usually want to schedule them automatically and set up a backup retention policy. There are some options for how long you want to store the backups on the server; I usually pick the smart backup retention, which keeps one backup for each of the last seven days, each of the last four weeks, and each of the last twelve months. That's a pretty good plan in my opinion, and it gives you enough options to restore files from specific timestamps. This is how a backup job looks when it's running, and it's also pretty straightforward to restore files from a backup: you can select a specific timestamp you want to restore files from, then choose whether you want to restore them to the original location and overwrite the existing files or somewhere else, and you can also select which individual files should be included in the restore job and which should not.

Most tutorials would just stop at this point, but we need to cover what comes next, because it is really, really important. While this solution is perfect for simply copying files away, if you want to back up databases with Duplicati, it gets a little bit more complex, and this is because databases work a bit differently than normal files on the file system. When you want to do a backup of a database, you could technically just copy the database files away from the file system, but this is not really a good idea, especially with big databases where those files can get really large and there are a lot of frequent updates and changes happening. This can be a really big problem, because if you start a backup and then some changes happen to the database while the backup hasn't finished yet, you will face inconsistencies between those database entries, and in the worst case it may even corrupt the complete database. You won't even notice it while the backup job runs successfully, but once you start recovering from this corrupted database, you will see that the backup can be completely useless.
As I said, this may work for smaller databases where the backup job just takes one or two seconds and there are no changes happening within that time, so you may be fine, but technically this is not a secure and solid solution for database backups. There are two ways you can overcome this. The easiest solution is to just stop the database container, start the backup, and once the backup job has finished, start the database container again. But of course this results in downtime, so your database or your application is unavailable during that backup period, and you also need to somehow script that automatically depending on the backup job.

So I use another method, and that is taking an online database backup. For example, if you're running a MySQL database, a PostgreSQL database, or even an SQLite database, there are command-line tools you can use to connect to the database server and take an online backup. That gives you a reliable backup solution for your database without shutting down the database server, and the backup tool makes sure that this backup is not corrupted by changes or updates happening while the backup is running.

I've created a simple bash script that does an automatic backup of all the database containers that are running on my system, and then I later use Duplicati to back up those database dump files to a second location, like a cloud storage. Let's take a look and walk through the script, and then you can use it within your own infrastructure or environment as well. You will find this backup script on my GitHub page in the scripts repository, so don't worry, I've put a link in the description below of course.

This bash script may look a bit complicated if you're not familiar with programming, but it just does the following steps. The first part sets and configures the backup directory and the number of days I want to retain these database backups. Then I execute the docker ps command, which gives me a list of all Docker containers, and I simply filter them based on the mysql and mariadb images and store them in the containers variable. This extracts all MySQL and MariaDB Docker containers that are running on my host system, and the script executes part two for each of these containers sequentially. The part inside the for loop gets the database name and the database root credentials from the environment variables of each container, where they are usually stored, and it executes the command mysqldump, one of the CLI tools I've just mentioned, which takes an online backup of these databases. The backups are stored in the backup directory that is defined in the static variable at the beginning, and there's also a third part which removes all backups that are older than the defined number of days.

Note that this only works for MySQL and MariaDB, because I only use these databases. If you have other databases, for example PostgreSQL or MongoDB or whatever, you should check out the documentation and write your own script for your database. All databases should have some CLI tool to generate a dump: in MongoDB this is mongodump, for example, in SQLite you can do it with the .backup command, and so on. You really need to find the right CLI tools for the databases that you're using on your server.

Here's also a quick example of how you can use that script to back up an SQLite database, for example in a Bitwarden instance, because I'm also running Bitwarden_RS on my server, and I've created a part two in the script that also does an automatic backup for this instance. It works the same as the instructions above, but I've just skipped the part with the credentials here, because Bitwarden_RS databases can be accessed without a username and password, and I replaced the mysqldump command with the equivalent command that works with SQLite databases.
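To make this more concrete, here's a minimal sketch of that kind of script. It is not the exact file from my GitHub repository; the backup path, the image filter, and the Bitwarden_RS database location are placeholders, and it assumes the database containers expose MYSQL_DATABASE and MYSQL_ROOT_PASSWORD as environment variables, which the official MySQL/MariaDB images do:

```bash
#!/bin/bash
# Sketch of an online database backup script (placeholder paths and names).

# Part 1: configuration - where to store the dumps and how long to keep them.
BACKUP_DIR="/home/youruser/backup"
RETENTION_DAYS=7
mkdir -p "$BACKUP_DIR"

# List all running containers and keep only those based on mysql or mariadb images.
CONTAINERS=$(docker ps --format '{{.Names}} {{.Image}}' | grep -Ei 'mysql|mariadb' | awk '{print $1}')

# Part 2: take an online dump of every matching container.
for CONTAINER in $CONTAINERS; do
    # Read the database name and root password from the container's environment variables.
    DB_NAME=$(docker exec "$CONTAINER" printenv MYSQL_DATABASE)
    DB_PASS=$(docker exec "$CONTAINER" printenv MYSQL_ROOT_PASSWORD)

    # mysqldump takes a consistent online backup without stopping the database.
    docker exec "$CONTAINER" mysqldump -u root --password="$DB_PASS" "$DB_NAME" \
        | gzip > "$BACKUP_DIR/${CONTAINER}-$(date +%F).sql.gz"
done

# Optional part: dump an SQLite database, e.g. from a Bitwarden_RS instance.
# No credentials are needed; this assumes the sqlite3 CLI is installed on the host
# and that the container's data directory is bind-mounted at /srv/bitwarden (placeholder).
sqlite3 /srv/bitwarden/db.sqlite3 ".backup '$BACKUP_DIR/bitwarden-$(date +%F).sqlite3'"

# Part 3: remove dump files that are older than the retention period.
find "$BACKUP_DIR" -type f -mtime +"$RETENTION_DAYS" -delete

# To restore a MySQL/MariaDB dump later, decompress it and pipe the SQL statements
# back into the database, for example:
#   gunzip -c mydb-2021-04-12.sql.gz | docker exec -i <container> mysql -u root --password=<password> <database>
```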
I know that bash scripting may be new to some of you guys, but trust me, it is really important that you learn this stuff, especially if you want to manage your own servers, and if you want to go down the sysadmin or DevOps path in IT, you need to know it. What I will do is create more tutorials on bash scripting and also Python of course, so please let me know in the comments if you're struggling somewhere or which tutorials you want to see in the future. I also try to do more live streams on my Twitch channel where I do programming, scripting, and tinkering around with this stuff, so if you want to learn it, you're not alone. Just join our Discord, jump into some of my streams, and let's learn together with other IT professionals and help each other out with this stuff.

Once you have written your bash script and tested it, it creates those database dumps in your desired backup location. You can see in this example it creates three database backups: one for my Bitwarden SQLite database and two MariaDB/MySQL dumps for Nginx Proxy Manager and a WordPress website on my server. You could now just use this script to back up all your databases, but if you want to manage these dumps with Duplicati, let's also create a new backup job so you can store these dump files somewhere else. For testing purposes I will of course just use a local folder or drive, but make sure you create a separate folder for each backup job where you want to store the backups. Then we can select our source files, and because we have mounted the root directory into the /source/hostfiles location inside the Duplicati container, we can see the entire host file system under this location. Now we just want to select our three database dumps, which are located under the host OS's home folder, my username, and the backup directory, and you can see all three dump files are there. Let's select them and create a backup job for them. Of course, let's also schedule this job and set up a smart backup retention for it, and once this job is created, the backup just takes one second, as these databases are not really big. It's also the same easy process if you want to restore those files, like in our previous backup job. If you need to restore a database from these dump files, you can decompress them with gzip -d and simply execute all the SQL statements that are inside these dump files.

Okay, so now we have a script that can dump a database, and we can run it automatically with a cron job or create a systemd timer entry. I may do a full tutorial on systemd jobs in the future, but here's a quick walkthrough. First, put your script somewhere in a folder. Then I created two files, a .service and a .timer file, and put these files in the /etc/systemd/system directory. You can find these two templates for the systemd job in the same GitHub repo as the database script, so just use them as a template and replace my username and the locations with yours. The first file, db-backup.service, defines a new unit for the systemd service that executes a simple bash script, in this case our backup script that we have stored in my home folder; it's also executed under my current user. The second file, db-backup.timer, defines when we want to run this unit. The OnCalendar attribute can be changed to daily, hourly, weekly, whatever you need; this is just an example which executes our unit every day at midnight.
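As a rough sketch (the actual templates are in the GitHub repository mentioned above; the username, script path, and descriptions here are placeholders), the service unit could look something like this:

```ini
# /etc/systemd/system/db-backup.service
[Unit]
Description=Dump all databases to the local backup folder

[Service]
Type=oneshot
# Placeholder user and script path - replace with your own.
User=youruser
ExecStart=/bin/bash /home/youruser/scripts/db-backup.sh
```

And the matching timer unit:

```ini
# /etc/systemd/system/db-backup.timer
[Unit]
Description=Run the database backup on a schedule

[Timer]
# "daily" runs the unit every day at midnight; hourly, weekly, or a
# calendar expression like "*-*-* 02:00:00" work as well.
OnCalendar=daily

[Install]
WantedBy=timers.target
```

Type=oneshot fits here because the backup script runs once and exits, and using a timer instead of cron keeps the schedule visible later via systemctl list-timers.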
Now we need to copy these two files into the /etc/systemd/system directory; of course you should do this with root privileges, and you can see there are also other jobs already in there. To enable the timer you can use the command sudo systemctl enable followed by the name of the unit, and you can check if it was successfully enabled with systemctl list-timers --all. Our backup timer is listed here, but it didn't start automatically, so you can also do this by executing the systemctl enable command and adding --now. Then you can see that the next run will be executed at midnight, and if we check our backup folder, you can see that it just created three new dump files, which we can then later back up and encrypt with Duplicati to another cloud storage location or network storage.

I know this was a lot of information in one single video, but managing databases is not a trivial job, and you should know how to script, back up, and restore a database with CLI tools, especially if you're responsible for the database storage on your servers.

Okay guys, I hope this helped you to create a reliable backup job for your entire Linux server. If you enjoyed this video, please don't forget to hit the like button, and subscribe if you want to see more tutorials for IT professionals. In some of my next videos we will also cover backups of virtual machines or with other file systems, like ZFS snapshots and so on, but I wanted to first create this tutorial as a starting point that can be applied on any Linux server, regardless of the file system or the environment underneath, and that also works in a dockerized environment. As always, if you have any questions or you're having trouble, just join our Discord community; we have a very friendly and respectful group of IT professionals there, and I'm really proud of you guys, that's really awesome. So thanks everybody for watching, enjoy the rest of your day, and take care of yourself. Bye.
Info
Channel: The Digital Life
Views: 7,864
Rating: 4.9786096 out of 5
Keywords: backup in linux, backup in linux command, backup in linux server, how to take backup in linux server, schedule backup in linux, backup and restore linux, backup files, linux backup, linux full system backup, rsync backup, rsync backup linux, back up, rsync tutorial
Id: JoA6Bezgk1c
Length: 22min 42sec (1362 seconds)
Published: Mon Apr 12 2021