So I released a video a few months ago on why enterprise disks were the go-to choice for home and SMB NAS use cases, and I'll link that below. But in
that video I talked about why I had stopped using Western Digital and was buying Seagate disks.
And this generated a lot of conversation about people's experiences with each drive manufacturer
and there's a lot of polarized opinion about this. It basically came down to people refusing to buy either Seagate or Western Digital based on prior experience with failures, and a few people questioned why I had not considered Toshiba. This was mostly very reasonable, but it is clear that personal experiences heavily influence our opinions on manufacturers' drive quality, which is probably to be expected. So today I'm going to go to a large source of disk reliability data and
do some in-depth analysis of drives from the major manufacturers and look at their reliability
and lifespans. The goal is to take any emotional biases out of the equation and look at the story
that the data tells about drive reliability. So firstly, the source of this data. Backblaze is a US cloud backup provider that has been around for about 16 years. They are interesting in that they not only monitor and record data about their drive performance, but they publish it publicly, and this
is awesome. And the data provides insight into the reliability of around 345,000 drives that they
have deployed over the last 10 years. 99.65% of these drives are from the four major brands, namely Seagate, Western Digital, Toshiba and HGST. Now, HGST was formed out of IBM's and Hitachi's disk businesses, but today it's actually a wholly owned subsidiary of Western Digital; until late 2015 it ran entirely independently. But as the state of its integration into Western Digital isn't clear from a parts and process standpoint, I'm going to treat them separately for the purposes of this analysis. And I want to say, by the way, a big thank you to Backblaze for sharing their data and allowing its use. The detail it contains reveals a lot about these drives. A quick note on
my methodology so you can understand what the data contains and how I work with it for this analysis.
I took the data, which is a little shy of 200 gigabytes, and put it into a SQL database; this produced around 410 million rows of data on around 345,000 disks used over the last 10 and a half years or so. 99.4% of the data set is spinning disks, with a smaller number of M.2 drives and SSDs which I will ignore for this specific analysis. The set is made up of about 140 disk models, of which 87 are consumer models, 40 are enterprise models and four are NAS models, with the remainder being laptop disks. Capacities vary from 80 gigabytes at the low end up to 22 terabytes. The standout
capacities with the most deployed disks are 4, 12, 14 and 16 terabytes, each with about 60,000 to 100,000 units. The data set only contains 66 18-terabyte disks and four 22-terabyte disks, so as of June
2023 those disks are only really starting to be deployed. The data is made up of about 55% Seagate, 19.5% HGST, 16.7% Toshiba and 8.5% Western Digital, with 0.3% Samsung-branded drives from before the Seagate acquisition. I also cut the Samsung data from the analysis as that set is too small to draw firm conclusions from: it's 1,200 disks in total and they are mostly 2.5-inch disks.
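To give a feel for how I work with the data once it's in the database, here is a minimal sketch of the kind of summary queries involved. It assumes a hypothetical table called drive_stats with one row per drive per day, using the column names from the published Backblaze CSVs (date, serial_number, model, capacity_bytes, failure and the SMART attributes), and the brand mapping from model prefixes is my own assumption rather than something in the data; my real schema may differ.

```sql
-- Headline numbers for the set: total daily rows and distinct drives.
-- Assumes a table drive_stats loaded from the Backblaze daily CSVs.
SELECT COUNT(*)                      AS total_rows,
       COUNT(DISTINCT serial_number) AS distinct_drives,
       MIN(date)                     AS first_day,
       MAX(date)                     AS last_day
FROM drive_stats;

-- Rough brand split by model prefix (the prefix-to-brand mapping is assumed;
-- the CSVs only carry the model string).
SELECT CASE
         WHEN model LIKE 'ST%'                            THEN 'Seagate'
         WHEN model LIKE 'HGST%' OR model LIKE 'Hitachi%' THEN 'HGST'
         WHEN model LIKE 'TOSHIBA%'                       THEN 'Toshiba'
         WHEN model LIKE 'WDC%'                           THEN 'Western Digital'
         ELSE 'Other'
       END                           AS brand,
       COUNT(DISTINCT serial_number) AS drives
FROM drive_stats
GROUP BY 1
ORDER BY drives DESC;
```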
Before we start, a brief history of the hard disk drive brands, so you know where the pedigree, technology and intellectual property come from. The hard disk industry has been through decades of acquisition
and consolidation, with just three manufacturers remaining today. And these are Seagate, Western
Digital and Toshiba. They have around 45%, 36% and 18% of market share respectively. You can see
here the history of these acquisitions, and if you've been around for a while you may recognize
names from the past such as Samsung, Hitachi, IBM, Maxtor and Quantum, amongst others. According to data from Tom's Hardware, around 50 million hard drives were shipped in Q1 2023, and this is a significant
reduction of around 36% year on year. But spinning disks are still the go-to for large-scale cheap
storage and development is still occurring. As we will see, reliability on these disks is generally
remarkably good and has been improving over the last few years. So the first insight is the vendor
deployment over time in the data. This graph shows the number of operational drives, not just added drives; as drives are added and removed, the data here shows the net result of that.
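If you want to reproduce a chart like this yourself, a simple approximation is to count the distinct drives reporting in each month, split by brand. Here is a sketch against the same hypothetical drive_stats table as before, treating the date column as its original YYYY-MM-DD text and reusing the assumed brand mapping:

```sql
-- Approximate in-service fleet per month and brand: drives that report
-- at least once in that month.
SELECT substr(date, 1, 7) AS month,
       CASE
         WHEN model LIKE 'ST%'                            THEN 'Seagate'
         WHEN model LIKE 'HGST%' OR model LIKE 'Hitachi%' THEN 'HGST'
         WHEN model LIKE 'TOSHIBA%'                       THEN 'Toshiba'
         WHEN model LIKE 'WDC%'                           THEN 'Western Digital'
         ELSE 'Other'
       END                           AS brand,
       COUNT(DISTINCT serial_number) AS drives_in_service
FROM drive_stats
GROUP BY 1, 2
ORDER BY 1, 2;
```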
We will come to deployment and removal rates later in the video. The key here shows HGST in grey, Seagate in green, Toshiba in red, and Western Digital colored blue. Darker colors are the enterprise drives, with
the lighter being the consumer units. NAS drives are shown as a middle color, but the deployed base is so small that it is invisible here on this graph. But you can see that Backblaze originally deployed consumer drives, with the bulk from HGST, who are now owned by Western Digital, and from Seagate. HGST enterprise drives started being deployed around March 2014, with Seagate enterprise drives from around March 2017. By 2017, 50% of disks in service were enterprise disks, which
became the dominant deployment choice going forward from that point. We can also see that Backblaze has shifted suppliers around a little, but Seagate has been a consistent supplier, with HGST a preference in the early days and then a shift from them towards Toshiba. Later in the data we see that they started to deploy more WD drives. There were some WD drives deployed back in 2014, but the numbers were low and they were removed after a few years. There is a lot of data that tells us why some of this may have happened, but it won't just be reliability; it will also be related to supplier availability, terms and conditions such as warranty, and pricing of course. Backblaze clearly
had a preference for consumer drives at the start, which has shifted to enterprise, and they have
talked some about this in their published blogs, which is great. HGST drives are still available
today but they do seem hard to get and it looks like WD favors branding drives as Western Digital
and not HGST. And this may explain why deployment of the HGST set has pretty much stopped in the
last year and has shifted towards Western Digital. Now let's dive into the drive deployments so we
can see the manufacturers' capacities and quantities deployed over these 10 years. This aligns with what we saw before, but it adds four dimensions to the data: you see the date along the x-axis at the bottom, with the average drive size on the y-axis at the side. The colors represent the suppliers, with green being Seagate and red being Toshiba on the bottom graph, and then grey being HGST and Western Digital colored blue above. I'm going to use these colors throughout when multiple
manufacturers are on the same graph. As the points use the average drive size, they will primarily be influenced by the major drives being deployed, but they can move around a little when some larger or smaller drives also go into service.
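As an aside, the averages behind these bubbles are straightforward to pull from the same hypothetical drive_stats table; a rough sketch, noting that capacity_bytes in the Backblaze data is the raw byte count, so dividing by 1e12 gives decimal terabytes:

```sql
-- Average capacity (decimal TB) of the drives reporting in each month, plus
-- the drive count behind each bubble; add the brand CASE from earlier to
-- split the result by vendor.
SELECT substr(date, 1, 7)            AS month,
       AVG(capacity_bytes) / 1e12    AS avg_capacity_tb,
       COUNT(DISTINCT serial_number) AS drives
FROM drive_stats
GROUP BY 1
ORDER BY 1;
```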
We see two large blobs back in 2013, which is the existing deployment base when the data starts, averaging about 2.8 terabytes. We then see primarily HGST and Seagate four-terabyte disks until 2016-2017, then eight-terabyte disks moving to 12 in 2018, and then gradually increasing up to 14 and then 16 terabytes. The visualization clearly shows the
brand preference during the period; the size of the bubble shows how many are deployed, and the scale is normalized so you can compare visually. For example, this large Seagate blob in the middle
is around 7,200 drives in January 2018 with an average capacity of 12 terabytes. We're actually
going to come back to those 12 terabyte drives later. We see how this varies between consumer
and enterprise drives with the consumer drives in grey and the enterprise shown in gold. I think this
graph speaks for itself, but again it shows average capacity and quantity with normalized bubble sizes.
Backblaze has talked about its use of consumer drives, but it's clear that it moved towards almost
exclusive enterprise drive use in 2017. All this provides context about what Backblaze deployed
and operated, but let's now look into the failure data, as this is what is really interesting
for those trying to understand drive quality and life. And here we see a chart that shows the
timeline along the x-axis and the average life of drives at the time of failure on the y-axis. So
the x-axis here doesn't tell you about the nature of the failures, but when they occurred. You would
expect to see a sloping trend up and to the right, likely with increasing failure numbers as shown
by the bubble sizes. This is because as drives are deployed longer, their lifetime hours will
increase. The failures shown by the bubble sizes are not absolute counts, as the quantities deployed by each vendor vary greatly and would not be comparable. Instead, these bubbles represent the percentage of current in-service disks that fail in each month.
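For anyone who wants to derive these two numbers themselves, here is a minimal sketch, again using the hypothetical drive_stats table and assuming smart_9_raw holds power-on hours as it does in the published Backblaze schema; the failure column is 1 on the last day a failed drive reports:

```sql
-- Monthly failure view behind the bubbles: the percentage of drives seen that
-- month which recorded a failure, and the average power-on hours of the
-- drives that failed. Add the brand CASE from earlier to split by vendor.
SELECT substr(date, 1, 7) AS month,
       100.0 * COUNT(DISTINCT CASE WHEN failure = 1 THEN serial_number END)
             / COUNT(DISTINCT serial_number)           AS pct_failed,
       AVG(CASE WHEN failure = 1 THEN smart_9_raw END) AS avg_hours_at_failure
FROM drive_stats
GROUP BY 1
ORDER BY 1;
```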
Some observations here are that the percentage failure rates on the blue WD drives are large before 2021, but still spread between 0 and 50k hours of service, and this can be explained by the small deployed base. Any single failure, or small number of failures, is going to result in a larger percentage. We also see that HGST in grey, Seagate in green, and Toshiba in red have comparable failures, with Toshiba having a slightly shallower slope, indicating that failures happen a little earlier, but the failure percentages are also a little lower. HGST drives have smaller failure rates, which suggests better survivability. The number of blue-colored WD failures is quite small, but they are less consistent and more of them fall in the 0 to 20K hours range, which isn't a good indicator. One brand being lower or higher on
the graph doesn't really indicate anything other than that the batch of deployed disks was brought online later, so it has fewer hours on it. It's really the angle and the consistency of the trend that are more interesting, along with the size of the bubbles. Recently more WD drives are being deployed, but the failure percentages look higher and are very scattered, and this is probably because they've
not been in service long enough to develop a firm trend. The other interesting thing we see is
a large collection of failures for Seagate back in the 2013-2015 range, and these are consumer drives that failed before reaching much more than twenty thousand hours. It has been suggested that Backblaze's choice to use consumer drives on this kind of workload is likely the reason, and they do seem to have moved towards enterprise drives. However, looking at the data we see that consumer
and enterprise drives in the past five years or so seem to have similar survival rates. There is a lot
of data that I will have to come to in a follow-up, but this is one thing that is interesting. What is also possible is that these drives in the 2013-2015 time frame were simply not of the same quality and tended to fail earlier. In 2011 Thailand suffered catastrophic flooding that massively impacted hard disk manufacture, a large majority of which was performed in the country. Seagate, Toshiba and WD were all heavily focused in Thailand and had to move capacity elsewhere, but the data suggests that quality has improved since. Those failures would be for drives deployed in the 2011-2013 time frame, and drive sourcing and pricing may also be factors that led Backblaze to deploy consumer drives at
that time. You can also see from the scale on the right that despite the size of these bubbles, they
do generally represent very small percentages. For example, the bubbles at the very top right here represent only about a 0.25% failure rate, meaning that of all in-service Seagate drives only about one in four hundred failed in each month, and they had an average drive power-on time of around 46,000 hours, which is around five and a quarter years. The absence of bubbles in the bottom right of
the graphs is a good sign for drive reliability, with the current deployed base from Seagate having an average age at failure above 35,000 hours for the last couple of years, which is around four years. I stress, though, that these are averages and a deeper dive is needed to get more granularity, and it is worth noting these are the failure ages for drives that failed; there is obviously a large percentage of drives that did not fail and continue to run in service.
To draw strong conclusions we need to go a bit deeper, and the data tells us that there are some specific drive models from different vendors that had higher failure rates, and these skew the data significantly. These shouldn't be excluded from the data, because it's a problem if models without adequate QA are mass shipped, but it's worthy of more analysis. What we can infer from what we've seen so far is that HGST appears to have a more robust drive, with Seagate following and Toshiba appearing to have slightly lower reliability. But it doesn't look like any of the brands
are significantly worse. Some manufacturers had specific bad models in the data that have hurt their stats, and there is the possibility that the drives themselves were actually not the issue and that it's just that a batch got damaged in transit, but we need to dig deeper to assess the likelihood of that. The jury may still be out on Western Digital, as the historical data
doesn't make them look the best, and actually the small deployment of WD NAS drives had bad failure stats. It's early days for the newer enterprise disks, but initial signs do look good. Maybe some of that
HGST pedigree has helped here. And of course pricing, brand reputation and so on will all play a role in your decision, and I talked in the linked video below about my recent experiences with
Western Digital and how it affects my decision process. A really strong technical reason to go
to WD might sway me but I don't see it here, at least for now. We're going to check back in
as Backblaze releases more data. There is a lot more in the data that's worthy of exploration
and I will be making some more content on all of this. This includes failures versus decommissions, which reveals some interesting things, as well as the direct comparison of these numbers between the vendors. I'm also going to be looking at specific models and failure rates, focusing on the currently available 10 to 18 terabyte range, and if you want to catch that, don't forget
to subscribe. The 12-terabyte Seagate drives I mentioned earlier are especially interesting, as the data reveals a specific model that was deployed at scale and then almost completely removed again due to high failure rates, and we're going to dig into that in a follow-up as well.
If this was interesting, please do hit the thumbs up; it really helps with channel growth and reach
and it tells me that this content is interesting and that I should make more of it. And if you want
to comment on my observations or you have your own to add, please also drop them in the comments.
I'm always interested in constructive feedback and ideas from my viewers. And above all, thank
you for watching and I will see you in the next...