I've been following this report for many years, but Backblaze, as a backup service (traditionally), has very different IO patterns than many users. They originally started with consumer drives, which we found to be far too unreliable. In my experience, the BER and write cycles have a dramatic impact on overall drive performance. The MTBF declines sharply as write cycles increase, both as a percentage of IO and overall IO.
Backblaze changed IO patterns with B2, but that would be the key data for me to make this more useful: failure rate as a percentage of bytes read/written, etc.
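To make the "failure rate per bytes read/written" idea concrete, here is a rough sketch of what that normalization could look like. Everything in it is an assumption for illustration: the `ModelStats` type, the `failures_per_pb` helper, and the numbers are made up, and the sketch leaves open where the byte counters would come from (some drives expose total LBAs written/read through SMART, but not all do).

```python
from dataclasses import dataclass

PB = 10**15  # bytes in one decimal petabyte


@dataclass
class ModelStats:
    """Hypothetical per-model totals; not a Backblaze data format."""
    model: str
    failures: int
    bytes_written: int
    bytes_read: int


def failures_per_pb(stats: ModelStats, include_reads: bool = False) -> float:
    """Failures per petabyte of IO (writes only by default)."""
    total = stats.bytes_written + (stats.bytes_read if include_reads else 0)
    if total <= 0:
        raise ValueError("no IO recorded for this model")
    return stats.failures / (total / PB)


# Made-up example: 10 failures, 500 PB written, 1500 PB read.
example = ModelStats("MODEL-A", failures=10,
                     bytes_written=500 * PB, bytes_read=1500 * PB)
per_pb_written = failures_per_pb(example)
per_pb_total = failures_per_pb(example, include_reads=True)
print(f"{per_pb_written:.3f} failures/PB written, "
      f"{per_pb_total:.3f} failures/PB of total IO")
```

Comparing models on this basis would only be meaningful if their workloads are otherwise similar, which is exactly the information the report doesn't give us.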
While I find this data interesting, it isn't usually very actionable.
The SKUs with the lowest failure rates immediately get bought out (if they are still available, which they are not always) and will never be available again. You also always run the risk of "getting a bad batch" or just getting some drives that got beat up in shipping.
Usually this data is only useful for keeping an eye on your own stuff and prioritizing replacements when the time comes.
When buying drives I just look at the sizes I need and the performance, then get 1/3rd from each of the manufacturers.
Yeah, usually by the time you know a specific model is or isn't "good" the mfg has changed production or how things are laid out in the products themselves. Over time, you can glean that some mfg have been better or worse overall than others though, but that's not a promise of future efforts.
All the same, it's definitely cool and interesting to see. I've had some good and some very bad luck with storage drives over the years. I still think twice about Seagate drives since I had 6 out of 8 of their 3TB enterprise models go bad relatively quickly a decade and a half ago, specifically bought through separate vendors. I also had the first IBM Deskstar drives; the second died before the first could be RMA'd (RAID1 isn't backup).
While it's tough if you want new drives, I've found I could frequently get used drives on eBay that have significant history on Backblaze's report. Despite the increased risk from used drives, I've found I still end up with more reliable drives than if I bought random new ones.
I'm mainly looking at manufacturer and model failure rates in aggregate over a period of time like 6 months to determine my next purchases. As you pointed out, SKUs with the lowest failure rates get slurped up and you always run the risk of bad batches.
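For anyone wanting to do the same kind of aggregation, a minimal sketch follows. It uses the drive-days form of annualized failure rate (failures divided by drive-years, times 100), which is how these reports describe AFR as far as I know. The `afr_by_model` helper, the model names, and all the numbers are made up for illustration.

```python
from collections import defaultdict


def annualized_failure_rate(drive_days: int, failures: int) -> float:
    """AFR in percent: failures per drive-year of operation."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    return failures / (drive_days / 365) * 100


def afr_by_model(records):
    """Sum drive days and failures per model, then compute AFR.

    records: iterable of (model, drive_days, failures), e.g. one row
    per model per quarter.
    """
    totals = defaultdict(lambda: [0, 0])
    for model, drive_days, failures in records:
        totals[model][0] += drive_days
        totals[model][1] += failures
    return {m: annualized_failure_rate(d, f) for m, (d, f) in totals.items()}


# Made-up numbers: two hypothetical models over two quarters (~6 months).
records = [
    ("MODEL-A", 900_000, 12),
    ("MODEL-A", 925_000, 13),
    ("MODEL-B", 36_500, 4),
    ("MODEL-B", 36_500, 6),
]
rates = afr_by_model(records)
for model, afr in sorted(rates.items(), key=lambda kv: kv[1]):
    print(f"{model}: {afr:.2f}% AFR")
```

Summing drive days before dividing matters: averaging the per-quarter percentages would give small populations too much weight.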
My takeaway... The specific model plays a huge role in the failure rate.
A great model has an MTBF of 250 years.
A bad model might have an MTBF of just 5 years.
I suspect if you had a need for reliable storage which couldn't be met with the usual RAID approach, buying 2nd hand drives from eBay of a model and batch proven to be really reliable is probably your best bet.
And to answer the obvious question... One use case where you want reliability and can't use RAID is where you are selling a product that only has physical space or money for one drive, for example a standalone CCTV storage device.
Every drive failure will lead to an unhappy customer and a product return, so you really want the failure rate in the first 10 years of operation to be 1% or below (which none of the drives in this study can do).
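A quick back-of-the-envelope check of those numbers, as a sketch. It assumes a constant failure rate (an exponential lifetime model), which real drives don't follow exactly since failure rates change with age, so treat the output as ballpark only. The function names are mine.

```python
import math


def failure_probability(mtbf_years: float, years: float) -> float:
    """P(a drive fails within `years`), assuming a constant hazard of 1/MTBF."""
    return 1.0 - math.exp(-years / mtbf_years)


def required_mtbf(max_failure_prob: float, years: float) -> float:
    """MTBF in years needed so at most `max_failure_prob` fail within `years`."""
    return -years / math.log(1.0 - max_failure_prob)


great = failure_probability(250, 10)  # the "great model" MTBF from above
bad = failure_probability(5, 10)      # the "bad model" MTBF from above
needed = required_mtbf(0.01, 10)      # target: 1% or fewer failures in 10 years

print(f"MTBF 250y: {great:.1%} fail within 10 years")
print(f"MTBF 5y:   {bad:.1%} fail within 10 years")
print(f"MTBF needed for 1% in 10 years: {needed:.0f} years")
```

Under that assumption even the 250-year model loses roughly 4% of drives in 10 years, and hitting 1% would take an MTBF close to 1,000 years.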
What Backblaze is doing here is so underrated. This is large-scale, practical, in-datacenter, real-world data on essential hardware infrastructure that is available almost nowhere else, and they provide it, and their excellent analysis, completely for free.
I miss this culture and I admire leadership that allows it to not only exist, but thrive. I fear the day a stockholder meeting occurs and someone wringing their hands sees the decommissioned pennies they can save by limiting or stopping these reports.
What it buys is long-term goodwill. Engineers will see they know their stuff and suggest them as a solution for projects and people.
That said, all it would take is for the wrong leadership to start cutting corners to undo all of this hard work.
Backblaze stuck my email on a list, and now I get daily marketing spam from them. They shattered that goodwill with me very quickly.
This is the main reason I use them for their S3-compatible storage service over their competitors. While it's not enterprise-level revenue, I still like to think it makes a difference.
For as long as Backblaze has been doing this and at this level of quality, I have no doubt that these reports are good for business.
(As an anecdotal example -- I first heard about Backblaze from these reports many years ago and have relied on them to an extent in selecting new drives. I'm now a Backblaze customer.)
> by limiting or stopping these reports
Hopefully not, given that the performance one was just added!
> I fear the day a stockholder meeting occurs and someone wringing their hands see the decommissioned pennies they can save by limiting or stopping these reports.
The Backblaze stock has taken a beating over the years. Recently I saw some news that there were issues with financial reporting (and fraud?). So it’s anybody’s guess as to what may happen or if the company would even be around (as it exists now) in the next decade.
I’d guess they may already have tools in place to prepare the stats and charts, leaving some amount of writing as manual work (which could or would probably be offloaded to generative AI). But analyzing the reliability of drives and publishing the data could also be seen as a competitive advantage when comparing with newer companies (positive and negative).
Is there any danger this data is biased? Everything good gets corrupted eventually (Amazon reviews, Consumer Reports, ...). Is it possible they get some kickbacks for positive reviews?
It's always possible. But I haven't seen anything that would imply this to be the case so far in all the years I've been reading this.
Given the upcoming 2-year enterprise data shortage due to hyperscalers, I'm curious how this will affect Backblaze.
That is SSD/Memory.
These are HDDs.
HDDs have also been under pressure ... Barely a month ago there was an article here from somebody who set up a cluster of like half a million, with several thousand HDDs, just to store data for AI training.
Not even two days ago there was an article about a backlog on HDDs for AI, because everybody and their grandmother wants to store the entire internet, out of fear that AI scraping will become more difficult. Aka, they are gating data. And yes, you can train AI easily on HDDs even with their lower IOPS. The fact that you have a few thousand in parallel does the trick, and it's often bandwidth issues that hit harder.
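To put rough numbers on the "a few thousand in parallel" point, a small sketch. The per-drive figures (about 200 MB/s sequential and about 100 random IOPS) are ballpark assumptions of mine, not measurements, and the math assumes perfect parallelism with no controller or network bottleneck.

```python
def aggregate_throughput_gb_s(num_drives: int, mb_per_s_per_drive: float) -> float:
    """Combined sequential throughput in GB/s, assuming perfect parallelism."""
    return num_drives * mb_per_s_per_drive / 1000


def aggregate_iops(num_drives: int, iops_per_drive: float) -> float:
    """Combined random IOPS, assuming requests spread evenly over all drives."""
    return num_drives * iops_per_drive


# Assumed ballpark figures for a single large HDD (not measured).
DRIVES = 3000
SEQ_MB_S = 200
RANDOM_IOPS = 100

throughput = aggregate_throughput_gb_s(DRIVES, SEQ_MB_S)
iops = aggregate_iops(DRIVES, RANDOM_IOPS)
print(f"{DRIVES} drives: ~{throughput:.0f} GB/s sequential, ~{iops:,.0f} random IOPS")
```

With those assumptions the drives together can stream far more than most networks can move, which fits the point that bandwidth tends to be the harder limit.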
I just stockpiled a few extra 4TB NVMe drives because I learned my lesson. NVMe has not been dropping in price after the manufacturers pushed it up, and AI is going to keep eating NVMe storage for a long time. Let alone HDD storage...
Welcome to the new normal ... Crypto miners killing GPU prices, HDD crypto miners, crypto miners again back with a vengeance, oh, a pandemic, everybody needs hardware... A short time of benefits because of overproduction (on NVMe especially, manufacturers cut back production) AAAAND .. here comes AI.
It's something every flying year.
In my neck of the woods (HK), HDD prices pretty much doubled in the last 2 months. I bought a 22TB Toshiba 1 year ago at 30% less than what they cost now.
Shortage -> Glut