Comparing Seagate vs Western Digital (WD), Toshiba and HGST hard disk failure rates and lifespans.

Captions
So I released a video a few months ago on why enterprise disks were the go-to choice for home and SMB NAS use cases, and I'll link that below. But in that video I talked about why I had stopped using Western Digital and was buying Seagate disks. This generated a lot of conversation about people's experiences with each drive manufacturer, and there's a lot of polarized opinion about this. It basically came down to people refusing to buy either Seagate or Western Digital based on prior experience with failures, and a few people questioned why I had not considered Toshiba. This was mostly very reasonable, but it is clear that personal experiences heavily influence our opinions on manufacturers' drive quality, which is probably to be expected.
So today I'm going to go to a large source of disk reliability data and do some in-depth analysis of drives from the major manufacturers and look at their reliability and lifespans. The goal is to take any emotional biases out of the equation and look at the story that the data tells about the drives' reliability.
So firstly, the source of this data. Backblaze are a US cloud backup provider who've been around for about 16 years. They are interesting, as they not only monitor and record data about their drive performance but also publish it publicly, and this is awesome. The data provides insight into the reliability of around 345,000 drives that they have deployed over the last 10 years. 99.65% of these drives are from the four major brands, namely Seagate, Western Digital, Toshiba and HGST. Now HGST was formed out of IBM and Hitachi, but today it's actually a wholly owned subsidiary of Western Digital, and until late 2015 it ran entirely independently. But as the state of its integration into Western Digital isn't clear from a parts, process and integration standpoint, I'm going to treat them separately for the purpose of this analysis.
And I want to say, by the way, a big thank you to Backblaze for sharing their data and allowing its use. The detail it contains reveals really a lot about these drives.
A quick note on my methodology so you can understand what the data contains and how I work with it for this analysis. I took the data, which is a little shy of 200 gigabytes, and put it into a SQL database. This produced around 410 million rows of data on around 345,000 disks used over the last 10 and a half years or so. 99.4% of the data set are spinning disks, with a smaller number of M.2 and SSDs which I will ignore for this specific analysis. This is made up of about 140 disk models, of which 87 are consumer models, 40 are enterprise models and four are NAS models, the remainder being laptop disks. Capacities vary from 80 gigabytes at the low end up to 22 terabytes. The standout capacities with the most deployed disks are 4, 12, 14 and 16 terabytes, each with about 60,000 to 100,000 units. The data set only contains 66 18 terabyte disks and four 22 terabyte disks, so as of June 2023 those disks are only really starting to be deployed. The data is made up of about 55% Seagate, 19.5% HGST, 16.7% Toshiba and 8.5% WD, with 0.3% from Samsung, branded from before the Seagate acquisition. I also cut the Samsung data from the analysis, as the data set is too small to get firm data from. It's 1,200 disks in total and they are mostly two-and-a-half-inch disks.
Before we start, a brief history of the hard disk drive brands, so you know where the pedigree, technology and intellectual property come from. The hard disk industry has been through decades of acquisition and consolidation, with just three manufacturers remaining today. These are Seagate, Western Digital and Toshiba. They have around 45%, 36% and 18% of market share respectively.
You can see here the history of these acquisitions, and if you've been around for a while you may recognize names from the past such as Samsung, Hitachi, IBM, Maxtor and Quantum, amongst others. According to data from Tom's Hardware, around 50 million hard drives were shipped in Q1 2023, and this is a significant reduction of around 36% year on year. But spinning disks are still the go-to for large-scale cheap storage, and development is still occurring. As we will see, reliability on these disks is generally remarkably good and has been improving over the last few years.
So the first insight is the vendor deployment over time in the data. This graph shows the number of operational drives, not just added drives. So as drives are added and removed, the data here shows the net result of that. We will come to deployment and removal rates later in the video. The key here shows HGST in grey, Seagate in green, Toshiba in red and Western Digital in blue. Darker colors are the enterprise drives, with the lighter being the consumer units. NAS drives are shown as a middle color, but the deploy base is so small that it is invisible here on this graph. But you can see that Backblaze originally deployed consumer drives, with the bulk from HGST, who are now owned by Western Digital, and from Seagate. HGST enterprise drives started being deployed around March 2014, with Seagate enterprise drives from around March 2017. And by 2017, 50% of disks in service were enterprise disks, which become the dominant deployment choice going forward from that point.
We can also see that Backblaze has shifted suppliers around a little, but Seagate has been a consistent supplier, with HGST a preference in the early days and then a shift from them towards Toshiba. Later in the data we see that they started to deploy more WD drives. There were some WD drives deployed back in 2014, but the numbers were low and they were removed after a few years.
There is a lot of data that tells us why some of this may have happened, but it won't just be reliability. It will also be related to supplier availability, terms and conditions such as warranty, and pricing of course. Backblaze clearly had a preference for consumer drives at the start, which has shifted to enterprise, and they have talked some about this in their published blogs, which is great. HGST drives are still available today but they do seem hard to get, and it looks like WD favors branding drives as Western Digital and not HGST. This may explain why deployment of the HGST set has pretty much stopped in the last year and has shifted towards Western Digital.
Now let's dive into the drive deployments so we can see the manufacturers' capacities and quantities deployed over these 10 years. This aligns with what we saw before, but it adds four dimensions to the data. You see the date along the x-axis at the bottom, with the average drive size on the y-axis at the side. The colors represent the suppliers, with green being Seagate and red being Toshiba on the bottom graph, and then grey being HGST and Western Digital colored blue above. I'm going to use these colors throughout when multiple manufacturers are on the same graph. As the points are using the average drive size, they will primarily be influenced by the major drives being deployed, but can move around a little when some larger or smaller drives are also going into service.
We see two large blobs back in 2013, which is the existing deployment base when the data starts, averaging at about 2.8 terabytes. We then see primarily HGST and Seagate four terabyte disks until 2016-2017, and then eight terabyte disks moving to 12 in 2018, and then gradually increasing up to 14 and then 16 terabytes.
The visualization clearly shows the brand preference during the period. The size of the bubble shows how many are deployed, and the scale is normalized so you can visually compare. For example, this large Seagate blob in the middle is around 7,200 drives in January 2018 with an average capacity of 12 terabytes. We're actually going to come back to those 12 terabyte drives later.
We see how this varies between consumer and enterprise drives, with the consumer drives in grey and the enterprise shown in gold. I think this graph speaks for itself, but again it shows average capacity and quantity with normalized bubble sizes. Backblaze has talked about its use of consumer drives, but it's clear that it moved towards almost exclusive enterprise drive use in 2017.
All this provides context about what Backblaze deployed and operated, but let's now look into the failure data, as this is what is really interesting for those trying to understand drive quality and life. Here we see a chart that shows the timeline along the x-axis and the average life of drives at the time of failure on the y-axis. So the x-axis here doesn't tell you about the nature of the failures, but when they occurred. You would expect to see a sloping trend up and to the right, likely with increasing failure numbers as shown by the bubble sizes. This is because as drives are deployed longer, their lifetime hours will increase. The failures shown by the bubble sizes are not absolute numbers, as the quantity deployed by each vendor varies greatly and the counts are not comparable. Instead, these bubbles represent the percentage of current in-service disks that fail in each month.
Some observations here are that the percentage failure rates on the blue WD drives are large before 2021, but still spread between 0 and 50k hours of service. And this can be explained by the small deploy base.
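As a rough illustration of that bubble-size metric (a sketch, not the analysis code behind the charts), the monthly failure percentage can be read as failures divided by in-service drives. The function name and both fleet sizes below are made up for the example and are not taken from the Backblaze data.

```python
def monthly_failure_pct(failures: int, in_service: int) -> float:
    """Percentage of in-service drives that failed in one month."""
    if in_service <= 0:
        raise ValueError("in_service must be positive")
    return 100.0 * failures / in_service


# Hypothetical fleets: the same two failures against very different deploy bases.
small_fleet = monthly_failure_pct(failures=2, in_service=400)
large_fleet = monthly_failure_pct(failures=2, in_service=40_000)

print(small_fleet)  # 0.5
print(large_fleet)  # 0.005
```

The same two failures produce a bubble a hundred times larger on the small fleet, which is why a small deploy base makes the percentages look big and jumpy.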
Any single or small number of failures is going to result in a larger percentage. We also see that HGST in grey, Seagate in green and Toshiba in red have comparable failures, with Toshiba having a slightly shallower slope, indicating that failures happen a little earlier. But the fail percentages are also a little lower. HGST drives have smaller failure rates, which suggests better survivability. The number of blue-colored WD failures is quite small, but they are less consistent and more of them fall in the 0 to 20K hours range, which isn't a good indicator. One brand being lower or higher on the graph doesn't really indicate anything other than that the batch of deployed disks was brought online later, so it has fewer hours on it. It's really the angle and the consistency of the trend that are more interesting, along with the size of the bubbles. Recently more WD drives are being deployed, but the failure percentages look larger and are very scattered, and this is probably because they've not been in service long enough to develop a firm trend.
The other interesting thing we see is a large collection of failures for Seagate back in the 2013-2015 range, and these are consumer drives that failed before reaching much more than twenty thousand hours. It has been suggested that Backblaze's choice to use consumer drives on this kind of workload is likely the reason, and they do seem to have moved towards enterprise drives. However, looking in the data we see that consumer and enterprise drives in the past five years or so seem to have similar survival rates. There is a lot of data that I will have to come to in a follow-up, but this is one thing that is interesting. What is also possible is that these drives in the 2013-2015 time frame were simply not of the same quality and tended to fail earlier.
In 2011 Thailand suffered catastrophic flooding that massively impacted hard disk manufacture, a large majority of which was performed in the country. Seagate, Toshiba and WD were all heavily focused in Thailand and had to move capacity elsewhere. But the data suggests that quality has improved since. Those failures would be for drives deployed in the 2011-2013 time frame, and the drive sourcing and pricing may also be a factor that led Backblaze to deploy consumer drives at that time.
You can also see from the scale on the right that despite the size of these bubbles, they do generally represent very small percentages. For example, the bubbles at the very top right here represent only about a 0.25% failure rate, meaning that of all in-service Seagate drives only about one in four hundred failed in each month, and they had an average drive power-on time of around 46,000 hours, which is around five and a quarter years. The absence of bubbles in the bottom right of the graphs is a good sign for drive reliability, with the current deploy base from Seagate having an average age at failure above 35,000 hours for the last couple of years, which is around four years. I stress though that these are averages and a deeper dive is needed to identify more granularity. It is also worth noting that these are the failure ages for failed drives. There's obviously a large percentage of drives that did not fail and continue to run in service.
To draw strong conclusions we need to go a bit deeper, and the data tells us that there are some specific drive models from different vendors that had higher failure rates, and these skew the data significantly. These shouldn't be excluded from the data, because it's a problem if models without adequate QA are mass shipped, but it's worthy of more analysis.
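The conversions quoted above are easy to check. This is a minimal sketch of the arithmetic only; the helper names are invented for the example, and a year is assumed to be 365 days with leap days ignored.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours; assumption: leap days ignored


def hours_to_years(power_on_hours: float) -> float:
    """Convert drive power-on hours into years of continuous running."""
    return power_on_hours / HOURS_PER_YEAR


def one_in_n(monthly_failure_pct: float) -> float:
    """Express a percentage failure rate as 'one drive in N'."""
    return 100.0 / monthly_failure_pct


print(round(hours_to_years(46_000), 2))  # 5.25
print(round(hours_to_years(35_000), 2))  # 4.0
print(one_in_n(0.25))                    # 400.0
```

So 46,000 hours is about five and a quarter years, 35,000 hours is about four years, and a 0.25% monthly rate is one drive in four hundred.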
What we can infer from what we've seen so far is that HGST appeared to have a more robust drive, with Seagate following and Toshiba appearing to have a slightly lower reliability. But it doesn't look like any of the brands are significantly worse. Some manufacturers had specifically bad models in the data that have hurt their stats, and there is the possibility that the drives themselves were actually not the issue and that it's just that a batch got damaged in transit, but we need to dig deeper to assess the likelihood of that. The jury may still be out on Western Digital, as the historical data doesn't make them look the best, and actually the small deployment of WD NAS drives had bad failure stats. It's early for the newer enterprise disks, but initial signs do look good. Maybe some of that HGST pedigree has helped here.
And of course pricing, brand reputation etc. will all play a role in your decision, and I talked in the linked video below about my recent experiences with Western Digital and how they affect my decision process. A really strong technical reason to go to WD might sway me, but I don't see it here, at least for now. We're going to check back in as Backblaze releases more data.
There is a lot more in the data that's worthy of exploration and I will be making some more content on all of this. This includes failures versus decommissions, which reveals some interesting things, as well as the direct comparison of these numbers between the vendors. I'm also going to be looking at the specific models and failure rates, focusing on the currently available 10 to 18 terabyte range, and if you want to catch that don't forget to subscribe. The 12 terabyte Seagate drives I mentioned earlier are especially interesting, as the data reveals a specific model that was deployed at scale and then almost completely removed again due to high failure rates, and we're going to dig into that in a follow-up also.
If this was interesting please do hit the thumbs up, it really helps with channel growth and reach and it tells me that this content is interesting and that I should make more of it. And if you want to comment on my observations or you have your own to add, please also drop them in the comments. I'm always interested in constructive feedback and ideas from my viewers. And above all, thank you for watching and I will see you in the next...
Info
Channel: SomeTechGuy
Views: 32,789
Keywords: Seagate, Western Digital, HGST, Toshiba, Enterprise Disk, NAS Disk, Consumer Disk, NAS, HDD, hard disk, hard drive, spinning disks, disk failures, disk crash, SMART, backblaze, Storage, storage, PC build, disks, spinning rust, synology, qnap, unraid, freenas, RAID, best hdd, maxtor, quantum, ibm, samsung, hitachi, fujitsu, lacie, analysis, drive failure, head crash, data loss, disk recovery, Exos, Ironwolf, WD Red, Red Pro, barracuda, spinpoint, WD blue, best hard disk, best hard drive
Id: ipBVdCAJ9AY
Length: 14min 24sec (864 seconds)
Published: Fri Sep 15 2023