Since the financial crash of 2008, the predominant IT mantra has been to do “more with less” – perhaps it was said well before that too, because IT (and storage in particular) is on a growth curve that isn’t likely to stop any time soon. Consider this worrying statistic: between 2014 and 2020, the number of gigabytes deployed per IT professional will grow by more than a factor of 5, while the number of IT pros increases by only 28% (Source: https://www.emc.com/leadership/digital-universe/2014iview/business-imperatives.htm). Remember this covers all IT staff, not just those managing storage. Over recent years we’ve seen significant growth in the capacity (in TB) that each storage admin is expected to manage.
Henry Ford once said “I invented nothing new. I simply assembled the discoveries of other men behind whom were centuries of work”. Storage administrators may want to consider Ford’s reflection, as it has practical application in managing today’s storage growth issues.
Storage As A Pet
Looking back over my career, I remember a time when every disk drive had to be lovingly attended, like a favourite pet, to use the well-known pets/cattle analogy for web-scale computing. This of course was a time before RAID was established – in fact I started my career the year the term RAID was coined. If a disk crashed, every effort was made to recover any data on the failing device, before resorting to backups.
Pretty quickly this process became untenable and, as commodity disk drives in storage arrays became more prevalent, the disk drive was relegated to the status of “cattle” – if a drive fails, we simply discard it and replace it with another. Meanwhile, our “pet” became the array itself, which gained all kinds of features like UPS, battery backups and redundant components to keep it running with high availability. We pampered our pets and invested time and money to ensure they ran with optimum performance; think of how much effort we storage administrators spent on distributing the data for best performance, implementing tiering and optimising caches.
Today, with web-scale architectures, the disk drive is simply a component of the hardware and the industry is moving to treat whole servers as commodity. Now servers are cattle and scale-out storage means we can simply replace a failing device with another and move on.
Moving To Cattle
What about managing the performance of our storage? How do we do this in a web-scale world? Unfortunately, storage administrators simply don’t have time to manage the data on every server; even load balancing across multiple nodes is a thankless and futile task as the profile of mixed workloads changes over time. Instead, we need to separate the functions of the hardware from the service level required for the data on our storage infrastructure and let software do the hard work. This also means saying goodbye to the hard drive for the majority of our production data.
Until the widespread adoption of flash, all I/O requests were serviced as fast as possible because the hard drive was the slowest part of the infrastructure by a country mile. Storage arrays were optimised to reduce the I/O load on the backend disk through caching and clever algorithms that complemented spinning media. However flash storage provides us faster, much more predictable I/O than hard drives ever could, allowing the features of the hardware to be abstracted from the service levels required by the application.
This is our next step of evolution in storage; rather than manage each array or node individually, we need to build in quality of service (QoS) features that deliver I/O in a consistent and predictable way, using flash storage and based on the needs of the application. Storage administrators will in future manage an entire storage cluster based on QoS, adding physical resources where needed to meet capacity and/or performance demands. Intelligent software will be responsible for distributing workload across the hardware in the cluster.
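To make the idea of managing a cluster by QoS a little more concrete, here is a minimal Python sketch of the sizing decision described above: add nodes until the cluster can meet both the capacity and the performance demands placed on it. Everything in it is an assumption made for illustration; the Volume and NodeSpec types, the nodes_required function and all of the figures are hypothetical and do not represent any vendor's API or real product numbers.

```python
import math
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    min_iops: int      # hypothetical QoS floor guaranteed to the application
    capacity_gb: int   # provisioned capacity


@dataclass
class NodeSpec:
    iops: int          # assumed sustained IOPS one node can deliver
    capacity_gb: int   # assumed usable capacity of one node


def nodes_required(volumes, node):
    """Return how many nodes are needed to honour every QoS floor and hold all the data."""
    total_iops = sum(v.min_iops for v in volumes)
    total_gb = sum(v.capacity_gb for v in volumes)
    for_performance = math.ceil(total_iops / node.iops)
    for_capacity = math.ceil(total_gb / node.capacity_gb)
    # The cluster must satisfy whichever demand is larger, and needs at least one node.
    return max(for_performance, for_capacity, 1)


# Illustrative workload mix; the numbers are made up.
volumes = [
    Volume("database", min_iops=20_000, capacity_gb=2_000),
    Volume("vdi", min_iops=15_000, capacity_gb=8_000),
    Volume("archive", min_iops=1_000, capacity_gb=30_000),
]

capacity_bound = nodes_required(volumes, NodeSpec(iops=50_000, capacity_gb=10_000))
performance_bound = nodes_required(volumes, NodeSpec(iops=5_000, capacity_gb=100_000))
```

With these invented figures the first cluster is sized by capacity and the second by performance. The administrator states the service level each application needs, and software works out how much hardware that implies and where the data should live.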
Our new pet becomes the storage cluster itself, giving administrators the ability to start focusing on the next stage of evolution: the value of the data they are maintaining.