TrueNAS 12 ZFS Replication & Encryption

Captions
Tom here from Lawrence Systems. We're going to talk about TrueNAS 12 and ZFS replication. ZFS replication has been around in FreeNAS for a long time, and with TrueNAS 12 moving to OpenZFS 2.0, they've added the modern encryption methods available in OpenZFS 2.0. I'm bringing this up because you probably think you already know replication, and you probably do if you've used it in previous versions, but encryption adds one more thing you have to keep track of: the keys for the pool. I want to cover exactly how that works, and how ZFS replication does not only what it used to do but a little bit more with keyed replication pools, including the fact that you can set a destination that doesn't have the keys. You can have a server where the data lands that holds all the snapshots and all the backups without being able to see any of it. There are some nuances to the way they changed it, so I wanted to do this updated tutorial to cover all of that.

Before we dive into those details: if you'd like to learn more about me or my company, head over to lawrencesystems.com; if you'd like to hire us for a project, there's a Hire Us button right at the top. If you'd like to help keep this channel sponsor free (and thank you to everyone who already has), there is a Join button here for YouTube and a Patreon page; your support is greatly appreciated. If you're looking for deals or discounts on products and services we offer on this channel, check out the affiliate links down below in the description of all of our videos, including a link to our shirt store; we have a wide variety of shirts, and new designs come out, well, randomly, so check back frequently. And finally, our forums at forums.lawrencesystems.com are where you can have a more in-depth discussion about this video and other tech topics you've seen on this channel. Now back to our content.

The first thing we're going to touch on is understanding how ZFS keeps your data
safe, but I will not dive too deep into this topic because there's a lot of nuance to it. There's the debate that always comes up of why you'd use ZFS replication over something like rsync; in one particular testing scenario it came out about a thousand times faster, and there are a couple of good reasons for that, but I don't want to get too far off topic. I'll leave the links here where they break down the testing methodology. As a matter of fact, the whole Ars Technica article is really good: for those of you who may not understand some of the fundamentals of ZFS, it has a good explainer with visuals and animations. At least you'll have a link for further reading on why you'd use it over plain rsync, because there are scenarios where it makes a lot more sense.

Now let's get started. We have some test data here, and the idea is we're going to replicate it from this system, system A, over to system B. They're both running TrueNAS 12.0-U1.1, so they're as up to date as is available here on February 1st, 2021. Right now this pool has no snapshots and no backup tasks of any kind set up, so we'll be setting all of this up from scratch to show you how to do it. Of note: this pool is encrypted using keys and it's unlocked, which is why we have this icon right here, and this storage pool over here has its own set of keys as well. As a matter of fact, that is some of my production data; I'm doing a long-term test on this system, for those wondering.

All right, back over here we're going to start the replication process. The first thing that can be confusing about replication is that it's based on snapshots. I've got several videos on snapshots, and snapshots in TrueNAS 12 work the same as they did in previous versions, and replication still requires a snapshot. I bring this up because where
confusion sometimes comes in is that people will re-run a replication task thinking it backs up every time the task runs. Not necessarily: it re-synchronizes the last snapshot that you've attached it to. The good news is that when we create this, it's going to create the replication task and the related periodic snapshot task. You can have more periodic snapshot tasks than just the one needed for replication, if you have other snapshot strategies for your data, but it is important to know that the replication is based off the snapshot task that is linked to it. Replication needs a snapshot to run; just keep that in mind when you're doing this.

So we're going to go here: source location on this system, and there's our some-test-data to back up. "Recursive" sometimes creates a little bit of confusion. This folder, the some-test-data I have right here: I've just dropped a few video files in there. As a matter of fact, let's create a new folder, some-other-folder; it looks a little less messy if we move a few of those things in there, so we'll just drag them in, and drag a few more in. Now we have subfolders. I'm bringing this up because this is where "recursive" sometimes creates confusion: the option is to "also replicate all snapshots contained within selected source", and that applies to datasets, not folders. Even without it checked, these folders will be replicated and will show their data on the destination system. You actually don't need recursive unless you're nesting datasets. If you just have one dataset, and it's a share like this one with no sub-datasets, it doesn't really matter; datasets do show up as folders, but regular folders get backed up automatically without checking "recursive".

That's all you have to do on the source side. On the destination side, you have the option of "on this system" or "on a different system".
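Since replication only ever ships snapshots, it can help to see what the GUI's snapshot task boils down to at the shell. A minimal sketch, assuming shell access to a ZFS system (the pool and dataset names here are made up for illustration):

```shell
# Snapshots are what replication actually ships. Creating one by hand:
zfs snapshot tank/some-test-data@manual-2021-02-01

# List the snapshots for a dataset; this is what a replication task
# compares against to decide what needs to be sent:
zfs list -t snapshot -r tank/some-test-data
```

The periodic snapshot task the wizard creates is essentially automating that first command on a schedule.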
Replicating "on this system" is actually a handy way to get data synchronized between two pools: if a system has more than one pool, you can duplicate the data between those pools. Once again, it's all in one system, so it's not necessarily a good disaster recovery plan, but if you have two separate pools and something happens to pool A, and you've replicated it to pool B, you still have all the data. More ideal for disaster recovery planning is "on a different system", which uses SSH as its transport layer, so all the connections are encrypted.

They've made it really easy to build these SSH key pairs. The destination is a TrueNAS Mini, and we'll just use the semi-automatic setup in TrueNAS Core for doing this. What's kind of cool is we can go here, grab the URL, paste in literally the URL, and it will go ahead and log into that system. I'm just copying the password out of Bitwarden, for those wondering. Generate new private key: what this does is create a secure connection so this server can talk to that server. Create the connection, and once it's created and logged in, I can hit this and it sees the datasets over there; there's all that data, and looking over here, it can see the pools. We're going to drop this in as some-test-data-backup, so there's the backup destination we created. Don't bother checking this box, as that's out of scope for this particular video (you'll see how the keys work next), and don't mess with the encryption options unless you know what you're doing. Go ahead and hit Next. Run on a schedule: we can have it set to daily, and that's fine; you also have weekly, monthly, and hourly, and if you choose custom you can get more granular about how frequently you want it to run. This schedule covers both the replication task and the snapshot task: the snapshot occurs, and then the replication occurs. So we're going to go ahead and hit Start Replication.
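What the GUI sets up is conceptually the classic send-over-SSH pipeline. A rough hand-rolled sketch of the same idea, not what TrueNAS literally runs (hostnames and dataset names are hypothetical):

```shell
# Full send of the first snapshot to the backup host over SSH:
zfs send tank/some-test-data@auto-2021-02-01 | \
    ssh root@truenas-mini zfs recv backup/some-test-data-backup

# Later runs only ship the difference between the last snapshot the
# destination already has and the newest one on the source
# (-i = incremental):
zfs send -i @auto-2021-02-01 tank/some-test-data@auto-2021-02-02 | \
    ssh root@truenas-mini zfs recv backup/some-test-data-backup
```

That incremental form is why re-running a task without a new snapshot does nothing: there is no new snapshot pair to diff.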
It's pending right now and will run at the scheduled time, but we're going to force it to run now. "Replication small-drive-testing some-test-data has started." All right, great, now it's running and copying all of that data. Because there were no snapshots on here before, let's first look at the snapshot it created based on this replication task: if we go to Storage, Snapshots, there's the snapshot that was created along with it. We didn't have any prior, and now we have one. It created a snapshot and it's sending the data over.

Going back to Tasks, Replication Tasks: finished, all the data is copied. You can look at the logs and download them if you want, but it's done; it's copied over here. Click back on Pools just to refresh it, and hey, look: some-test-data-backup. Of note, this is encrypted; we've got the key icon right here, and if we filter snapshots, cool, we can see the snapshot. But what we can't see, and this is the important part, is the data stored in it, because we don't have the key on this destination system. So I SSH'd into it.
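This locked-but-received state is the behavior of an encrypted (raw) send: the blocks arrive still encrypted, and the destination never needs the key to receive them. A sketch of how you could see that from the shell (dataset names hypothetical):

```shell
# A raw send (-w) ships the blocks exactly as they sit on disk,
# still encrypted, so the receiving side never sees plaintext:
zfs send -w tank/some-test-data@auto-2021-02-01 | \
    ssh root@truenas-mini zfs recv backup/some-test-data-backup

# On the destination, the dataset exists but its key is not loaded;
# keystatus typically reads "unavailable" until a key is loaded:
zfs get encryption,keystatus backup/some-test-data-backup
```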
We go to the folder, and there are my LTS videos and lab datasets, but notice how the backup folder is not showing up. We can ls all we want, but it's not there. That's because without the keys there's no way to actually get in and see that data. The some-test-data is here, but it's encrypted with the keys, and we can't do anything until we decrypt it.

Now this is the important part: say you didn't back up your keys. You're sending all that data and it's landing encrypted, which is perfectly fine, but then you have a catastrophic failure of the source system. I have a video on how to back up keys, but if you didn't follow that and didn't back up the keys, that catastrophic failure would also mean you just have a pile of encrypted data on the destination system, and without those keys you're unable to do anything with it. The keys are not transmitted with the zfs send.

We can go over here, though, go to Storage, Pools, and export the keys: enter the password, continue, and there's the key; or you can hit Download and it downloads a text version of the key. So it's pretty straightforward to get the keys, but there is a bit of a bug right now, and I have a bug report filed, just so people know, because I want to cover one other scenario when you're exporting the keys. We can also export the dataset keys this way, which creates a JSON file with the keys; here's that JSON file with that particular key in it. I'm bringing this up to show you how we unlock it: we go here, we want to unlock this, Choose File, grab that JSON file, Submit, and it fails. I'll leave a link for those wondering about the progress of this bug. If you're watching sometime in the future this may have been fixed by an update (it looks like it's targeted for 12.0-U3), but FYI, if you try to pull the key from the JSON file, that's broken. So let's do it the other way and cancel out of this dialog.
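Unlocking in the GUI corresponds to loading the wrapping key and mounting the dataset. A shell-level sketch of the same steps (the key file path is hypothetical):

```shell
# Load the wrapping key from a file (a hex key exported earlier),
# then mount the now-decryptable dataset:
zfs load-key -L file:///root/some-test-data-backup.key \
    backup/some-test-data-backup
zfs mount backup/some-test-data-backup

# keystatus should now report "available":
zfs get keystatus backup/some-test-data-backup
```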
All we have to do is either pull the key out of that JSON file, or copy and paste the encryption key from when it popped up; even this file here was just a text version of that key. You uncheck the box so that instead of a file you just drop the dataset key in, hit Submit, and it confirms with a little green check: dataset unlocked. There you go.

This key is now actually stored within the system: once you've saved it here and decrypted the dataset, when I export the dataset keys from this system and download them, there are two dataset keys. Here's the first one, and here's the one for some-test-data-backup, that same key again. It cumulatively collects the keys, so once you've done this, you don't really have to worry about the source system, because those keys are now backed up and copied over here as well.

Because we unlocked it, we'll bring the shell back on screen: ls, and now some-test-data-backup is showing up, and there's all the data that's in there. We have the some-other-folder and the two folders we had; you get the idea, the data is now viewable.

By the way, this is important: the data on the destination system is going to be set to read only. The reason is that every time the system does its zfs send for a differential, it's working out what changed and what needs to be sent. If you are messing with the data on the destination, you are going to create problems and a bunch of errors. It sets this to read only by default because it is meant to be read only to you on this particular system, so you don't delete or goof with the files. The files are visible; you can see them, and you could pull a file out and move it somewhere else, but you are not supposed to be modifying the data on the destination system.

One more thing I mentioned: because this is based on snapshots, let's go back to Tasks, Replication Tasks. You see it ran, it finished, and we set this to
run every night. Let's actually (not that folder, this folder, there we go) delete all this data, just purge a bunch of it. Delete, yep. All right, the only thing we have right now is this one folder; actually, let's get rid of these two as well, so we have only one folder and even less data. I've made these changes, so that data is gone. There is a snapshot I could roll back to, and this is where the confusion sometimes comes in, when people say "I thought the replication task would grab it; I made a change, I hit Run Now, I hit Continue, and nothing seemed to happen." The reason nothing seemed to happen is that there still isn't a new snapshot. If we go back to Storage, Snapshots, there's still only the one snapshot. And when we go back over to the destination system, the good news is the data's still there. You can keep running the replication task over and over, but without a new snapshot it's not going to try to determine what's different. It always looks at the snapshots, never, so to speak, at the active data, to determine if there's a change that needs to be sent over. So even though we just deleted a bunch of files, which would create change, as far as replication is concerned there's no change yet.

Let's change this up a little more and show you what does happen. Go to Tasks, Replication Tasks, and destroy this replication task: Delete, Confirm. Note that the snapshot still exists. We're going to get rid of the periodic snapshot task too, but that does not get rid of the snapshots themselves. So we go back to Snapshots and just roll back to this one. All right, refresh, and hey, there's all the data back; we rolled back to the snapshot. Now that I've rolled back, I don't need that snapshot anymore, so I'm going to delete it, because we're going to create a new task. We'll add another replication task. On this system
again, go here, same thing: we're just going to back up some-test-data, and once again we don't need to do anything recursive. On a different system: we already established this SSH connection last time, so we can just reuse it. I'll name the destination demo2, since it's a second demo we're doing here, and hit Next. Custom schedule, and just for speed we're going to set this to run every minute: if you put an asterisk in each of the fields, it will run this task every minute. Done, Start Replication, and now let's skip ahead one minute and let this replication start pushing all the data over.

Of note, if you schedule something this fast, make sure the first replication, which takes the longest because it's only differentials afterwards, doesn't overlap the next run. It will give errors if the first run hasn't completed before the next one wants to begin, which can happen depending on how much data there is and how long it takes to get from point A to point B. Sometimes an aggressive schedule may be necessary, but just an FYI, you'll end up with some problems if runs overlap.

All right, the data has been pushed over; over here we go, and there's demo2. Now, even though the encryption key is technically the same key, because we didn't change the keys, when we deleted the dataset off the destination server it purged the stored keys with it, like I said, so it doesn't unlock automatically; we'd have to go through that same process again to unlock the data. Actually, let's leave it locked, because I want to show you how disaster recovery works and how you don't necessarily need the data unlocked on the destination system. All it's doing is holding data; it doesn't have to know what it's holding. So this will run every minute, and we'll go delete some files again.
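The custom schedule uses standard five-field cron syntax; an illustrative fragment of what "every minute" and some saner alternatives look like:

```shell
# Five cron fields: minute  hour  day-of-month  month  day-of-week.
# An asterisk in every field means "match everything", i.e. run once
# a minute; fine for a demo, usually too aggressive for real use:
# * * * * *
#
# More typical examples:
# 0 0 * * *     # daily at midnight
# 0 * * * *     # hourly, on the hour
```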
Some of the data is in there, so let's purge it real quick and delete all that. All right, we deleted it, so there are some changes going on; wait another minute and let it run again. We've waited a few minutes, and on this destination system we see that a few snapshots have been copied over: here are the reference ones, and here are the changes, because we deleted some of the files between snapshots. This is just running every minute now.

So what if catastrophe strikes? This is where the question comes up of how do I get the data back, and there are a couple of different options. If the catastrophe is "oops, someone deleted a bunch of files on this particular system", ideally you just go back and roll back a snapshot to bring all the data back on the local system. But let's create a total disaster scenario, which unfortunately happens occasionally. First we'll stop the task from running, so I don't get a bunch of errors when I delete the dataset; we'll delete the accompanying snapshots; and then let's go to Storage, Pools, and what's worse than destroying a pool, right? So let's get rid of the pool: Export/Disconnect, destroy data on this pool, absolutely, confirm. It's going to get rid of the SMB share as well. Destroying the pool: other than physically destroying the machine, yeah, this is bad. All the data on that pool was destroyed. Not something you ever want to see unless you're confident in your backups.

So let's see how good our backups are. Go ahead and create a new pool, and this pool is going to have its own encryption key; yes, I understand this is going to be encrypted, new pool. Just throw some drives in here; RAIDZ1 is plenty for this. "The contents will be erased": don't worry, we did that in the last action, so no problem there. Create the new pool. All right, new pool created, so we're going to go to Tasks, Replication Tasks, Add. Now, this is under the assumption that
you backed up the keys from this system and have them on hand, and I bring it up because the data is what's most important. Source location: on a different system; it's going to use the TrueNAS Mini connection, and if not, we could create a new connection. For the source, go here, and what did we call it, demo2, there we go. Destination: new-pool, restored-data. "Restored data" is fine, I won't type anything more. It even found the six snapshots it's going to pull back over. We'll hit Next, Run Once; I only need this once, and I don't even need to make it read only, because I just need to grab that data from my backup and pull it back over. So go ahead and hit Start Replication. Replication has started... and replication has finished.

Let's go to Storage and look at our pools: there's all the restored data, and all the restored snapshots. This one is from before I deleted some data, and this one from after, so I don't just have the data, I have the revisions of the data that were captured by the snapshots. But of course I have the new challenge of how do I decrypt it. We go here to Unlock, and if we were smart and followed my previous backup video, we would have backed all of this up, so let's get that particular key out: that's the key we needed. This is why backing up dataset keys is so important. Continue, and it's decrypted and unlocked.

This whole time, for reference, the backup system over there still has no idea what's in here. It was just a destination; it held all the encrypted data. We survived destroying all the data over here and restoring it again, and that system still has no idea what's inside of that folder. This is one of the nice things about the encryption: you don't have to have the same level of trust, so to speak, because you're encrypting all the data at rest, and there's not necessarily a reason to have it unencrypted on the backup target. You can go back, completely restore it, and be able to see all the data.
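The restore is just replication pointed the other way. A rough shell equivalent of pulling the raw, still-encrypted stream back (hostnames, snapshot name, and key path are all hypothetical):

```shell
# Pull the newest snapshot back from the backup host as a raw stream;
# the blocks come back still encrypted, and only our backed-up copy of
# the wrapping key can unlock them afterwards:
ssh root@truenas-mini zfs send -w -R backup/demo2@auto-latest | \
    zfs recv new-pool/restored-data

# Load the backed-up key and mount the restored dataset:
zfs load-key -L file:///root/demo2.key new-pool/restored-data
zfs mount new-pool/restored-data
```

The -R flag carries the older snapshots along too, which is why the restored dataset arrives with its revision history intact.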
We'll just shell into the system real quick: /mnt, new-pool, restored-data, and hey look, there's the some-other-folder. It's all in there, restored just like it was, minus the little files we deleted; we could of course roll back to a snapshot and have all the other data back if we wanted to. But you get the idea: that backup system was blind to this data the whole time. It's encrypted at rest and the destination doesn't have to know anything about it, as long as you back up that encryption key, which is obviously incredibly important, because if you did not, this process would have failed at the unlock step, where we'd have been unable to load the key.

One quick piece of errata before we finish: if you go to Encryption Options on this dataset we just restored (it doesn't really matter, any dataset), you can switch it to a passphrase instead of key authentication. We'll set it to 12345678, not a great password, but if you use a passphrase instead and then do the replication, what changes is that all you have to remember is that passphrase. When you replicate over to the other server, it's just going to need the passphrase. So for those of you who aren't using keys and are using a passphrase instead: same process, everything's the same, except you don't worry about the key file; you just have to remember the passphrase. I wanted to make sure I left that in for those of you using passphrases.

So hopefully you now better understand the entire process by which ZFS replication works, why it's so imperative to back up those encryption keys, and that you can have your data resting somewhere else while a missing encryption key means it's not easily accessible.

All right, and thank you for making it to the end of the video. If you liked this video, please give it a thumbs up. If you'd like to see more
content from the channel, hit the Subscribe button, and hit the bell icon if you'd like YouTube to notify you when new videos come out. If you'd like to hire us, head over to lawrencesystems.com, fill out our contact page, and let us know what we can help you with and what projects you'd like us to work on together. If you want to carry on the discussion, head over to forums.lawrencesystems.com, where we can carry on the discussion about this video, other videos, or other tech topics in general; even suggestions for new videos are accepted right there on our forums, which are free. Also, if you'd like to help the channel in other ways, head over to our affiliate page; we have a lot of great tech offers for you. And once again, thanks for watching, and see you next time.
Info
Channel: Lawrence Systems
Views: 25,501
Keywords: lawrencesystems, zfs replication, zfs replication vs rsync, zfs replication freenas, zfs replication truenas, zfs replication speed, zfs, nas, network attached storage, storage, freenas, zfs (software), TrueNAS zfs replication, truenas, freebsd, open source
Id: XOm9aLqb0x4
Length: 25min 26sec (1526 seconds)
Published: Mon Feb 01 2021