I usually welcome discussion (and even argument) about the things I know best: There is always more to learn, and the best insights come through engaging those who disagree with us. But some ideas have been argued so well for so long that they deserve enshrinement. For example, although non-scientists like to argue about evolution and climate change, the scientific community no longer feels that their theories in these areas require much discussion. Like gravity and relativity, they have been accepted as a foundation upon which to build more interesting hypotheses.
My field of enterprise storage has its share of generally-accepted theories:
- Availability, backup, and archive form a Data Protection Trinity: They are unique requirements calling for focused solutions.
- The Rule of RAID: Combining multiple disk drives in creative ways allows us to change the inherent reliability and performance of the system (a rough illustration follows this list).
- When it comes to storage management, Homogeneity is Paramount: A single storage administrator can manage thousands of identical systems but would be hard-pressed to support a half-dozen unique ones.
- The entire history of computing demonstrates that Connectivity Trumps Capacity when sizing systems: Performance bottlenecks always limit the scalability of storage systems.
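The Rule of RAID is easiest to appreciate with a little arithmetic. The sketch below is a deliberately simplified, back-of-the-envelope illustration, not any array's actual algorithm: the eight-drive, 1 TB configuration is made up, and real systems layer caches, spares, and rebuild logic on top. It only shows how the same set of drives trades capacity for fault tolerance under different layouts.

```python
# Rough illustration of the "Rule of RAID": the same drives yield very
# different capacity and fault-tolerance characteristics depending on layout.
# Figures are back-of-the-envelope only; caches, spares, and rebuild
# behaviour are ignored.

def raid_summary(num_drives, drive_tb, level):
    """Return usable capacity (TB) and guaranteed-survivable drive failures."""
    if level == "RAID-0":   # striping only: all capacity, no protection
        return num_drives * drive_tb, 0
    if level == "RAID-10":  # mirrored pairs: half the capacity, at least one failure
        return num_drives * drive_tb / 2, 1
    if level == "RAID-5":   # single parity: one drive's worth of capacity lost
        return (num_drives - 1) * drive_tb, 1
    if level == "RAID-6":   # double parity: two drives' worth lost
        return (num_drives - 2) * drive_tb, 2
    raise ValueError(f"unknown level: {level}")

if __name__ == "__main__":
    for level in ("RAID-0", "RAID-10", "RAID-5", "RAID-6"):
        usable, failures = raid_summary(num_drives=8, drive_tb=1.0, level=level)
        print(f"{level:7s}: {usable:4.1f} TB usable, survives {failures} drive failure(s)")
```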
Each of these theories underpins our industry's daily routine of storing and retrieving the data that drives modern society. These storage theories are also targets for innovation, with the best minds constantly trying to bend or break them.
This album of storage theories also has a B-side, however. These are the no-longer-true theories that have been transcended, as well as the dubious beliefs that were never really true.
- Commutability of Management and Cost is highly suspect: Unless one is considering only identical and homogeneous systems, metrics such as total cost of ownership (TCO) or capacity per administrator (TB/admin) cannot be meaningfully compared between environments.
- The Price of Parity: The impact of parity calculations and multi-disk commits used to kill write performance, giving RAID-5 a bad name. But write-back caches and array intelligence have all but eliminated this “write penalty” for modern enterprise systems (see the sketch after this list).
- Whenever someone wants to argue against the high cost of enterprise storage, they are bound to trot out the Dumb Disk Fallacy, claiming that per-GB array costs ought to be comparable to the price of a bare disk drive. But the value of enterprise storage has always been greater than the sum of its parts.
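To put numbers behind the Price of Parity, here is a minimal sketch of the textbook I/O-amplification figures for small random writes. The 5,000-IOPS workload is hypothetical, and, as noted above, modern arrays blunt these figures with write-back cache and full-stripe writes.

```python
# Minimal sketch of the classic RAID write penalty for small random writes.
# Per-write I/O counts are the textbook figures; real arrays hide much of this
# behind write-back cache and full-stripe writes.

WRITE_PENALTY = {
    "RAID-10": 2,  # write the block and its mirror
    "RAID-5": 4,   # read old data + old parity, write new data + new parity
    "RAID-6": 6,   # as RAID-5, but with two parity blocks to read and rewrite
}

def backend_write_iops(frontend_write_iops, level):
    """Back-end disk operations generated by a given front-end random-write load."""
    return frontend_write_iops * WRITE_PENALTY[level]

if __name__ == "__main__":
    # Hypothetical load: 5,000 small random writes per second from hosts.
    for level, penalty in WRITE_PENALTY.items():
        print(f"{level:7s}: {backend_write_iops(5000, level):6,d} back-end IOPS "
              f"(penalty x{penalty})")
```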
Over the next few weeks, I will be sharing focused articles about these “holy cows” of the enterprise storage world. I encourage everyone in the industry to join me in taking a step back and shining some light on these and other truisms. Which do you agree with or dispute? Are there other theories that I have overlooked?
Hi Stephen,
I have to argue the Price of Parity point. I agree that a large enough cache can negate the penalty, but only in underutilised arrays.
E.g. 512 GB of cache might seem like a lot, but when you consider how many front-end ports these arrays have, how many hosts people attach to each, and the disk capacities on the back end, the cache doesn't seem all that large any more. And dare I say, more often than not these arrays are overloaded and the benefits of cache are lost…
I know lots of people who would like to build storage pools for wide striping on top of RAID-6 RAID groups, but won't because they know that the write penalty for RAID-6 is alive and well, despite what the vendors tell them.
Granted, it's not as bad as it used to be, but I don't think it has gone away, at least not on all arrays.
Just my thoughts
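The commenter's sizing intuition is easy to check on the back of an envelope. The figures below are purely hypothetical (they are not measurements from any particular array), but they show how a headline cache number shrinks relative to the capacity behind it.

```python
# Back-of-the-envelope check of the commenter's point: cache that looks large
# in absolute terms can be tiny relative to the back end it fronts.
# All figures here are hypothetical.

cache_gb = 512    # headline cache size
drives = 1000     # back-end drive count (hypothetical)
drive_tb = 1.0    # per-drive capacity (hypothetical)

backend_tb = drives * drive_tb
ratio = cache_gb / (backend_tb * 1024)  # cache as a fraction of back-end capacity

print(f"Cache is {ratio:.2%} of {backend_tb:.0f} TB of back-end capacity")
# Once the sustained random-write rate exceeds what can be destaged to disk,
# the raw parity penalty shows through again.
```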