what is Data Scrubbing (and how to enable on Synology)

Captions
Let's talk about data scrubbing. All right, I've got a board; it's going to help me to... you know what, let me change my lens out. Okay, that is actually a lot better. I'm going to use my chair. Nice. And I've got a pen. Hold on. Awesome.

Okay, data is complicated. There are actually a lot of professional organizations that spend a lot of time and invest a lot of money in securing data, because when you're dealing with electricity, which is how computers work, there's a lot of variation that can happen. It can get quite unexpected.

There are big organizations like CERN, which is (hold on, I wrote it down) the European Organization for Nuclear Research. They're a combination of 23 European member states: 23 different countries have come together and built this, and they're the ones who run the Large Hadron Collider, that big old particle accelerator in Europe. They have a lot of funding and a lot of research, and they're one of the most technologically advanced computing organizations that exist. We usually talk in gigabytes or terabytes, but they're actually dealing with a petabyte of data every single second. In all of their systems combined, about a petabyte of information goes through every single second. That is 1,000 terabytes of data.

They're dealing with a lot of data, and when you deal with a lot of data, you're going to start dealing with bad data sometimes. When you're running critical scientific research like a particle accelerator, it's extremely important that whatever data processing they use is highly reliable.

So why do we care? Because any NAS device, or any computer, is prone to data errors. We can run into the same kinds of data errors as these big organizations that store, move, and transmit a lot of data. We can learn a lot
from their research and the best practices they have established. Scrubbing is one of the core methods we use to detect and correct errors on our NAS devices before those errors become permanent.

What's really cool is just how scrubbing works. Let's imagine for a second that this whiteboard is our NAS device. Inside our NAS device, stored on our hard drives, are blocks of data. I'm going to be writing ones and zeros. Ones and zeros are bits; they're the base unit of how data is stored in electronic form. Maybe we have a block of data, and we're just going to make this up: 1 0 1 0. This is a block of data that we have on our hard drive, and I'm going to call it data block A. So data block A is 1 0 1 0. Now maybe we have a second data block on a second hard drive. We're going to name that data block B, and in data block B we've got 1 1 0 0.

You can see that the data in block A and the data in block B are slightly different, and we want to think about how we can compare these two pieces of data in order to recognize and fix a data error. That's all a data error is: one of these bits being switched from a one to a zero or from a zero to a one.

When you create a RAID array, whether that's RAID 5, RAID 6, or, if you're using a Synology NAS, their special SHR RAID, remember that, as I said in a previous video, you're creating fault tolerance for a drive. For example, in RAID 5, if one of your drives goes bad, you can pull out that drive, insert a new drive, and it'll rebuild the data on that RAID. We use a special function called XOR to create what's called a parity block of data. We're going to call this block right here parity block P. To build our parity block P, so that we can understand what's in block A and block B without having to have an exact copy of block A and block
B, we're going to perform a mathematical function called XOR. The symbol for XOR is a plus with a circle around it (⊕). We're going to do A XOR B. XOR is a very simple mathematical function: all it does is compare every single one of these bits and ask, are they the same or are they different? For the XOR function, zero means they match and one means they're different.

So it's very simple. The first bits of A and B: are they matching or are they different? One and one, so they're matching, so we write a zero. Now we do the same thing for the next bits: zero and one are different, so we write a one. How about one and zero? Again they're different, so we write another one. And for the final bits in block A and block B, we XOR zero and zero; those are the same, so we write a zero. [On the whiteboard: P = 0 1 1 0.]

Now we need to understand how scrubbing works. Now that we have our two data blocks and our parity block, we're going to compare them in order to understand whether a block is bad or not. All we do now is (can I erase this? I'm running out of whiteboard space) A XOR B, which we did, XOR P. So to discover whether the data in data blocks A and B is good or bad, we compute A XOR B XOR P.

In our illustration (I've got some tabs to help us), this is what block A should be: 1 0 1 0. But maybe some kind of data error occurs, and instead of 1 0 1 0 we have data degradation: a bit flips and it turns into 1 1 1 0. What we're going to do is use our parity drive and this A XOR B XOR P function to detect that there's an error when we scrub. So we're now scrubbing our drives and looking for errors. The first part of this function we're going to do now: one and one are
matching, so we write a zero; one and one are matching, so we write another zero; one and zero are different, so we write a one; and zero and zero are matching, so we write a zero. Now we compare 0 0 1 0 XOR our parity block: zero and zero are matching, so we write a zero; one and zero are not matching, so we write a one; one and one are matching, so we write another zero; and zero and zero are matching, so we write another zero. So the result of A XOR B XOR P, our parity block, is 0 1 0 0.

Anytime we scrub our drives and perform this parity function, anytime we get a one in the result, that means we have detected an error in our block. How cool is that? We can use math, our XOR function, and our parity block to detect whether there are errors in our data.

Not only can we detect an error in our data using the XOR function, but we can actually repair that data using another XOR function. XOR is so cool. I mean, we are really having a fun time here. The way we would repair data block A, now that we've detected that there's an error on data block A, is to use the function B XOR our parity block. I'm not going to write it down, but let's do it in our heads and figure out what that would be. So, B XOR P: one and zero are different, so that's a one. One and one are matching, so that's a zero, so we've just fixed that error. Zero and one are different, so that's a one. And zero and zero are matching, so that's a zero. How cool is that? Not only are we able to detect where the error might occur, but we're also able to fix, or heal, those errors on our drive.

Now, one of the questions you might have, if you're really thinking ahead of me here: when we do A XOR B XOR P, how would we know whether the bad data was in block A or block B? And the truth is, well, we don't. Well, now we drill down
into Synology and Btrfs, and the advantage that Btrfs provides. Btrfs uses a checksum called CRC32C that creates a fingerprint of the file we're looking at, so if any part of the file changes, that checksum value is going to change. What Btrfs does is build checksums of all our files, and then it checks the checksum of every single file. If it sees that the checksum should be one value but the checksum actually returns something different, our Synology NAS can identify when files on our hard drives are going bad. This all works when we're data scrubbing our drives. So it's really cool that it can detect errors in our data and fix them before the data errors become permanent.

I want to show you exactly how to enable data scrubbing if you're using a Synology NAS, and how to enable SMART reporting. I've logged into my Synology NAS and I've got File Station open, just to show you that I've already started adding my YouTube videos in order to save them on my Synology NAS, but I haven't really done anything other than that.

Let's go to the main menu and go to our Control Panel first, because we want to check that we can enable scrubbing. Go to Shared Folder, click on the one you want to check, go to Edit, and then go to Advanced. You want to make sure that this box right here, "Enable data checksum for advanced data integrity," is checked. For existing shared folders you can't change this setting; you can only check or uncheck it when you first create your shared folder. If it's not checked and you want to enable scrubbing, you're going to need to create a new shared folder and then move all of your existing data from your old shared folder to the new one that you just created with that setting checked.

Now we want to go ahead and enable file scrubbing. We're going to go back to our main menu and go to Storage Manager this time. We're going to click on our
storage pool, and then click on Schedule Data Scrubbing. Just check this box, "Enable data scrubbing," and then choose the storage pool that we want to enable. For frequency, I'm going to suggest the best practice for scrubbing frequency, which is going to cover most of the users watching this video, and that's to repeat scrubbing monthly. Under Frequency you can set how often you want data scrubbing to run.

Data scrubbing reads all the data on your drives, so it can be very intensive for the hard drives, which means you're going to get a performance hit if you're trying to do something with the Synology NAS while it scrubs the data. Whenever I work with a business, I make sure the Synology NAS data scrubbing only runs when the business isn't operating. I might do it overnight, so I would skip business hours, which might be 8 to 5, Monday through Friday. This is usually the schedule I like; obviously, set it to whatever you think is right for you. Then you just click OK. So we're going to go ahead and save this schedule, and now my scrubbing is going to run once a month. This is exactly what it looks like: anytime your NAS is running scrubbing, you're going to have this little icon at the top, and if you click on it you can see information about the scrubbing that's running.

The second thing we want to do is enable SMART reporting. If data scrubbing makes sure that the data on your Synology NAS is good, SMART testing looks at the physical condition of the drives and makes sure they're good. SMART data is kind of like a canary in a coal mine: it doesn't always detect when a hard drive is going bad, but it can often detect it early. To set that up, we're going to go and click on hard
disk drive/SSD (HDD/SSD). We're going to click right here, go to Settings, and then go to the test schedule. We want to set up two individual SMART test tasks: quick tests and extended tests.

Under task name, I'm going to call this "SMART Quick Test." Then I can go to Schedule. Under our schedule, I want to run quick tests weekly, so I'm going to select "Run on the following days" and change this to Sundays. Now, under time: this is not nearly as much of a performance hit as scrubbing is, so you may or may not want to change this. 12, you know, midnight, might be fine for you. I like to set it in the early morning, so I'll do something like 4; that way I can make sure I'm running these tests when nobody's using the NAS. Then we're just going to click OK. There we go, we've got our quick test.

Now we need to do our extended test. Create another task and call this "SMART Extended Test." We're going to select the extended test this time, and again "test all supported drives" is fine. We go to Schedule, and this one we're going to run once a month, so we can do repeat monthly; it's already selected. For the first run time, if I have my quick tests running every Sunday at 4, I'm going to run my extended test once a month at 5; that way the quick test and the extended test don't overlap each other. We're going to click OK and then hit Apply. So now we have our SMART tests enabled.

All right, cool, that's it. We've got all the testing we need enabled on our Synology NAS. This is probably one of the most important steps you can take to make sure the data on your Synology NAS stays good over time. That's just how important scrubbing is for maintaining the integrity of the data you have. Thank you very much for watching this video. I hope you had a good time. I can't wait to see you in the next one, and until then, take care.
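
[Editor's sketch] The whiteboard walkthrough in the video can be replayed in a few lines of Python. This is only an illustration under stated assumptions, not how Synology or Btrfs implement scrubbing: the helper name `xor_bits` and the `stored_crc` dictionary are made up for this example, the blocks are four-character bit strings rather than real disk blocks, and `zlib.crc32` (plain CRC-32 from the standard library) stands in for the CRC32C checksum mentioned in the video.

```python
import zlib


def xor_bits(x: str, y: str) -> str:
    """Bitwise XOR of two equal-length bit strings: 0 = bits match, 1 = bits differ."""
    return "".join("0" if a == b else "1" for a, b in zip(x, y))


# The two data blocks from the whiteboard and their parity block P = A XOR B.
block_a = "1010"
block_b = "1100"
parity_p = xor_bits(block_a, block_b)

# Checksums recorded while the data was known to be good.
# Assumption: zlib.crc32 is a stand-in for CRC32C, which is not in the stdlib.
stored_crc = {
    "A": zlib.crc32(block_a.encode()),
    "B": zlib.crc32(block_b.encode()),
}

# Simulate the bit flip from the video: block A silently becomes 1 1 1 0.
damaged_a = "1110"

# Scrub step 1: A XOR B XOR P. Any 1 in the result means something is wrong,
# but it does not say which block holds the bad bit.
syndrome = xor_bits(xor_bits(damaged_a, block_b), parity_p)
error_detected = "1" in syndrome

# Scrub step 2: compare each block's current checksum to the stored one
# to find out which block actually changed.
current = {"A": damaged_a, "B": block_b}
bad_blocks = [
    name for name, bits in current.items()
    if zlib.crc32(bits.encode()) != stored_crc[name]
]

# Scrub step 3: rebuild the bad block from the good block and the parity block.
repaired_a = xor_bits(block_b, parity_p)

print("parity P:  ", parity_p)
print("syndrome:  ", syndrome, "(error detected)" if error_detected else "(clean)")
print("bad blocks:", bad_blocks)
print("repaired A:", repaired_a)
```

The parity check alone only says that the stripe is inconsistent; the per-block checksum is what points at block A, and the final XOR of B with P restores A's original bits, which matches the order of steps described in the video.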
Info
Channel: Nick Talks Tech
Views: 435
Keywords: data protection, data reliability, data scrub, data scrub basics, disk error, drive maintenance, error prevention, how does data scrubbing work, how does nas scrub work, how does scrubbing work, how to enable scrubbing synology, prevent data loss, raid maintenance, storage maintenance, storage management, synology enable scrubbing, what is data scrubbing, what is nas data scrub, what is nas scrub, what is scrubbing
Id: pk7UH3n0NiM
Length: 17min 23sec (1043 seconds)
Published: Fri Jun 14 2024