I have to be blunt. Pure Storage threw me a curve in my first-ever briefing from them as part of the Tech Field Day Extra (TFDx) event at Pure Accelerate 2019. I knew Pure Storage as an all-flash array (AFA) pure play. Pure Storage is known for bringing AFA to the mainstream. So much so that, during the show, Pure Storage made an argument for AFA replacing general-purpose hard disk-based arrays. Pure asserted that AFA is cost-competitive with HDD arrays. It’s through that lens that I’ve viewed Pure. So, when they made the pitch for their first cloud-based service, I didn’t expect much from a company that I saw as a hardware company.
What I Thought
When Pure Storage showed a high-level overview of Cloud Block Store (CBS), I wasn’t impressed. I assumed it was an NVMe-backed EC2 instance running the Pure Storage OS. I saw that the cloud resources would exist in a customer’s AWS VPC. The decision to outsource billing and networking to the customer triggered some design assumptions in my mind’s eye.
I’ve seen similar solutions, and while they are great for non-production use cases, there were just too many operational issues related to the limitations of AWS. A simple example is expanding the storage array. It would require resizing the EC2 instance for more CPU. Adding CPU is a disruptive change in AWS. Another challenge is managing redundancy, as it’s a single node and the SLA for EC2 instances isn’t nearly at the level needed for a production array.
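To make the resize point concrete, here is a minimal sketch of the steps involved in changing an instance type through the EC2 API. The method names mirror the boto3 EC2 client, but the `RecordingEC2` class, the instance ID, and the target instance type are hypothetical stand-ins so the sketch runs without AWS credentials. It illustrates my assumption about a single-node design, not anything Pure Storage showed.

```python
class _NoopWaiter:
    """Stand-in for a boto3 waiter; returns immediately."""

    def wait(self, **kwargs):
        pass


class RecordingEC2:
    """Hypothetical stand-in for a boto3 EC2 client. It only records calls."""

    def __init__(self):
        self.calls = []

    def stop_instances(self, InstanceIds):
        self.calls.append("stop_instances")

    def start_instances(self, InstanceIds):
        self.calls.append("start_instances")

    def modify_instance_attribute(self, InstanceId, InstanceType):
        self.calls.append("modify_instance_attribute:" + InstanceType["Value"])

    def get_waiter(self, name):
        self.calls.append("wait:" + name)
        return _NoopWaiter()


def resize_instance(ec2, instance_id, new_type):
    """Change the instance type of an EBS-backed instance.

    The instance has to be stopped before its type can change, so
    anything served from it is unavailable for the duration.
    """
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
    ec2.modify_instance_attribute(
        InstanceId=instance_id, InstanceType={"Value": new_type}
    )
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])


if __name__ == "__main__":
    client = RecordingEC2()
    # Placeholder instance ID and type, chosen only for illustration.
    resize_instance(client, "i-0123456789abcdef0", "m5.2xlarge")
    for step in client.calls:
        print(step)
```

With a real client, the stop and start steps are where the outage comes from: a single-node array would be offline for the whole sequence.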
I Was Wrong
Fortunately, all my assumptions about the design were wrong. Pure Storage has done something unique in the industry. Before getting into the design, it’s essential to understand who the intended customer is for CBS. VMware broke the dam for hybrid cloud with VMware Cloud on AWS.
VMware surfaced the desire among customers seeking hybrid infrastructure to keep operations consistent between the public and private cloud. VMware Cloud on AWS preserves customers’ investment in their operating models by providing a consistent administration experience between vSphere on-premises and vSphere in AWS.
CBS attempts to bridge the same gap. The design goal of CBS is to keep a consistent operating model between on-premises Pure Storage AFA and CBS. Customers leverage the same constructs and tools to provision and replicate storage. The idea is to ease the transition to cloud-native services and provide a consistent storage underlay between cloud-native applications and traditional enterprise workloads.
Cloud Block Store Design
It’s important to note that there’s an overlap between potential VMware Cloud on AWS customers and potential CBS customers. However, CBS doesn’t support mounting to vSphere in VMware Cloud on AWS as a VMFS datastore. According to Pure Storage’s presentation at CFD6, no solution other than vSAN is supported as a datastore in VMware Cloud on AWS, so customers must mount non-vSAN storage, including CBS, from a guest VM within vSphere. VMware’s documentation, however, indicates that some solutions do support non-vSAN storage.
For the reasons stated above, you simply can’t install Pure Storage’s software on a single EC2 instance and have a production-quality storage array. Pure leverages multiple EC2 instances to create a high-performance and highly redundant set of virtual disks. These virtual disks front-end S3 storage that provides persistence. Two additional EC2 instances act as redundant controllers that provide the presentation layer. By leveraging high-bandwidth EC2 instances, Pure Storage can provide iSCSI-based block storage to customer workloads residing in the same AWS VPC.
There’s a lot to like about CBS. It’s a familiar abstraction to keep operations consistent. The infrastructure resides in a customer’s VPC, which provides a lot of knobs to control egress charges. CBS is also a 1.0 product with a lot of missing features. Pure Storage gave test and development examples for use cases. One of the main advantages of the public cloud is elasticity for workloads such as test and development. However, CBS isn’t very elastic yet. While you can expand the underlying storage by increasing the size of the EC2 virtual disks, Pure Storage doesn’t have a way to reduce the size of the storage array in the cloud. Also, the solution is very specific to AWS.
Pure Storage says that it will approach each public cloud with a unique solution. I expect this to slow delivery for Google Cloud Platform (GCP) and Microsoft Azure. On AWS, Pure Storage introduced a unique approach and deserves credit for engineering creativity.