
Multi-Cloud with Kubernetes and Pure Storage

For a typical enterprise, multi-cloud means using multiple cloud services for different and mostly orthogonal purposes. An organization that runs Office 365, Salesforce, and AWS for separate, independent applications is a good example of multi-cloud serving its business requirements. Hybrid cloud, by contrast, is the use of two or more clouds to host a single application whose components span those clouds. For example, on-premises databases may keep information close to where it is needed, providing low latency, while Microsoft Azure hosts data warehouse services built on that locally hosted data.

There is a third category that could easily be lumped in with hybrid or multi-cloud, wherein an application is deployed as an atomic unit in multiple clouds. This is common when a company provides services to clients and is bound by regulations or data sovereignty rules that require hosting the service in multiple clouds. The service might also be spread across multiple cloud services for redundancy, which, given the recent outages across the internet, might not be such a bad idea. Since the application is deployed as an atomic unit without linkages across clouds, it makes sense to classify it as multi-cloud.

Consistency Across Clouds

Deploying an application on multiple clouds presents some unique challenges. While each cloud has similar constructs in terms of compute, networking, and storage, the implementation of those constructs differs to a lesser or greater degree in each public cloud. Dealing with the vagaries of a particular cloud’s implementation requires a certain level of sophistication and expertise, making a multi-cloud deployment an administrative nightmare. What an organization needs is a consistent deployment model across multiple clouds without requiring in-depth knowledge of each cloud’s implementation.

The explosion of managed Kubernetes services across clouds provides a potential solution to the challenge of multi-cloud development and administration. Kubernetes offers a consistent administrative interface and development platform across multiple clouds, and leaves it to each cloud provider to manage its own Kubernetes platform.

There are three different consistency challenges to be met when dealing with a multi-cloud application deployment:

  • Development consistency – Can the application be deployed using the same constructs across multiple clouds?
  • Operational consistency – Can the operations team administer the application and its supporting infrastructure using a common toolset across multiple clouds?
  • Performance consistency – Does the application perform consistently and predictably across multiple cloud environments?

Development Consistency

When an application is being developed, certain assumptions are made about the environments into which the application will be deployed. Differences between environments – such as development, QA, and production – can be a source of consternation and require additional development testing to ensure that the application functions consistently across disparate environments. Using container-based applications and Kubernetes lowers the variation between environments and streamlines the development process. There will still be variations between different public and private clouds, especially in the way that networking and storage are implemented.

Operational Consistency

Each cloud used in a deployment will have its own management layer. Ideally, certain components of the cloud management layer can be abstracted or consolidated into a single toolset for the operations team to use in the administration of a solution. Kubernetes provides a consistent application management layer for the Ops team to use across multiple clouds, but there is still the question of how to effectively manage persistent storage, networking, and the Kubernetes cluster itself.

Performance Consistency

While it is important that an application is performant, it is even more important that the application performs in a consistent and predictable manner. When an application is deployed across multiple clouds, it can be hard to achieve that level of consistency across each deployment. Differences in the implementation of the compute, networking, and storage layers create inconsistencies in application performance. Kubernetes helps address performance at the compute layer and, to some extent, at the networking layer. Within the spec of each deployment, every container in a pod can declare resource requests (the minimum it needs) and limits (the maximum it may use). Kubernetes can allocate memory and CPU accordingly, but it cannot provide guarantees around storage performance.
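As a rough illustration of the requests and limits described above, the sketch below builds a container spec as a plain Python dictionary and checks that each request does not exceed its limit. The container name, image, and helper function names are hypothetical, and the quantity parsing covers only the common "m" CPU suffix and binary memory suffixes. It is not a full implementation of Kubernetes quantity handling.

```python
# Sketch: a container spec with resource requests (minimum) and limits (maximum).
# Names such as "web" and "example/web:1.0" are hypothetical placeholders.

def parse_cpu(quantity: str) -> int:
    """Convert a CPU quantity such as "500m" or "2" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


_MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as "256Mi" to bytes (binary suffixes only)."""
    for suffix, factor in _MEMORY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain byte count


def make_container(name, image, requests, limits):
    """Build a container spec, rejecting requests that exceed their limits."""
    if parse_cpu(requests["cpu"]) > parse_cpu(limits["cpu"]):
        raise ValueError("CPU request exceeds limit")
    if parse_memory(requests["memory"]) > parse_memory(limits["memory"]):
        raise ValueError("memory request exceeds limit")
    return {
        "name": name,
        "image": image,
        "resources": {"requests": dict(requests), "limits": dict(limits)},
    }


container = make_container(
    "web",
    "example/web:1.0",
    requests={"cpu": "250m", "memory": "256Mi"},
    limits={"cpu": "1", "memory": "1Gi"},
)
```

Note that the spec has a place for CPU and memory but nothing that expresses storage IOPS or latency, which is the gap the paragraph above points out.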

Storage in Kubernetes

At a basic level, storage is presented to pods in Kubernetes through the Volume construct. A volume’s lifecycle is tied to the lifecycle of the pod and not to the containers within the pod. If a particular container inside the pod is terminated, the volume persists. To provide persistent storage beyond the pod’s lifetime, Kubernetes has the concepts of Persistent Volumes (PV) and Persistent Volume Claims (PVC). PVs abstract the storage details away from developers, who simply request (claim) a volume of a particular size from a pool of PVs. Persistent Volumes can be pre-created and consumed in a static manner, or they can be dynamically provisioned when a claim is made. In the dynamic case, a storage class defines how the claim should be fulfilled. In addition to a storage class, a claim can carry label selectors that narrow which volumes may satisfy it, e.g. env = production and performance = ssd.
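To make the claim mechanics a little more concrete, here is a minimal sketch that builds a Persistent Volume Claim manifest as a Python dictionary and matches it against a pool of pre-created volumes. The claim name, the "fast" storage class, the simplified volume records, and the matching function are all assumptions for illustration. The real binding logic lives in the Kubernetes control plane and handles more cases than this.

```python
# Sketch: a PVC manifest plus a toy version of static PV matching.
# All names and the simplified PV records are hypothetical.

def parse_gi(quantity: str) -> int:
    """Parse a capacity such as "10Gi" into whole GiB (only "Gi" is handled)."""
    if not quantity.endswith("Gi"):
        raise ValueError(f"unsupported quantity: {quantity}")
    return int(quantity[:-2])


def make_pvc(name, size, storage_class, match_labels=None):
    """Build a PersistentVolumeClaim manifest as a dictionary."""
    spec = {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": size}},
        "storageClassName": storage_class,
    }
    if match_labels:
        spec["selector"] = {"matchLabels": dict(match_labels)}
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


def find_matching_pv(pvc, volumes):
    """Return the smallest volume that satisfies class, labels, and size."""
    spec = pvc["spec"]
    wanted = parse_gi(spec["resources"]["requests"]["storage"])
    labels = spec.get("selector", {}).get("matchLabels", {})
    candidates = [
        pv for pv in volumes
        if pv["storageClassName"] == spec["storageClassName"]
        and parse_gi(pv["capacity"]) >= wanted
        and all(pv["labels"].get(k) == v for k, v in labels.items())
    ]
    return min(candidates, key=lambda pv: parse_gi(pv["capacity"]), default=None)


volumes = [
    {"name": "pv-a", "capacity": "100Gi", "storageClassName": "fast",
     "labels": {"env": "production", "performance": "ssd"}},
    {"name": "pv-b", "capacity": "20Gi", "storageClassName": "fast",
     "labels": {"env": "production", "performance": "ssd"}},
    {"name": "pv-c", "capacity": "20Gi", "storageClassName": "fast",
     "labels": {"env": "dev", "performance": "ssd"}},
]

claim = make_pvc("db-data", "10Gi", "fast",
                 {"env": "production", "performance": "ssd"})
bound = find_matching_pv(claim, volumes)
```

In this example the claim lands on pv-b: pv-c is excluded by its env label, and pv-a qualifies but is larger than necessary.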

Within the storage class definition is a provisioner field that determines how storage is provisioned when a volume is requested. The volume plugins that support provisioning can be implemented in-tree, meaning within the Kubernetes project itself, or out-of-tree. An in-tree plugin requires that the code for a particular volume type be included in the main Kubernetes repository, so updates to that volume type are made in lockstep with the rest of Kubernetes versioning. Due to the difficulty of maintaining volume plugins in-tree, this plugin type has been deprecated in favor of out-of-tree volume plugins.
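The provisioner field can be illustrated with a short sketch that builds two StorageClass manifests as Python dictionaries. The first uses the in-tree AWS EBS provisioner name, and the second uses a made-up out-of-tree driver name (csi.example.com, with invented parameters). The helper relies on the convention that in-tree provisioner names carry the kubernetes.io/ prefix. Treat it as a rule of thumb for reading manifests, not as an authoritative test.

```python
# Sketch: StorageClass manifests and the role of the provisioner field.
# "csi.example.com" and its parameters are hypothetical.

IN_TREE_PREFIX = "kubernetes.io/"


def make_storage_class(name, provisioner, parameters=None):
    """Build a StorageClass manifest as a dictionary."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        "parameters": dict(parameters or {}),
    }


def is_in_tree(storage_class):
    """Guess whether the class points at an in-tree plugin, by name prefix."""
    return storage_class["provisioner"].startswith(IN_TREE_PREFIX)


ebs_class = make_storage_class(
    "standard", "kubernetes.io/aws-ebs", {"type": "gp2"}
)
vendor_class = make_storage_class(
    "vendor-fast", "csi.example.com", {"tier": "ssd"}
)

for sc in (ebs_class, vendor_class):
    kind = "in-tree" if is_in_tree(sc) else "out-of-tree"
    print(f'{sc["metadata"]["name"]}: {sc["provisioner"]} ({kind})')
```

A claim never names the provisioner directly. It names the storage class, and the class decides which plugin does the work.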

Out-of-tree volume plugins provide the ability for storage vendors to innovate outside of the Kubernetes release cadence. FlexVolume and Container Storage Interface (CSI) are two in-tree interfaces that enable the use of out-of-tree volume plugins and volume drivers by different storage vendors. FlexVolume plugins require that a local executable implementing the volume driver is installed on each node in the cluster to support the volume plugin type. The FlexVolume interface has been deprecated in favor of the Container Storage Interface. The CSI allows storage vendors multiple options for how they choose to implement their volume driver as long as it is consistent with the CSI spec.

Using an out-of-tree volume plugin with Kubernetes enables the use of third-party storage solutions. Those solutions can assist in providing consistency across multiple cloud environments by providing the same interface for developers, the same management platform for operators, and the same performance profile for the application.

Pure Service Orchestrator

In response to the need for multi-cloud consistency, Pure Storage has developed the Pure Service Orchestrator (PSO). The PSO provides an integration point between Kubernetes and multiple Pure Storage solutions, including FlashArray, FlashBlade, and Cloud Block Store (currently in beta). While FlashArray and FlashBlade are on-premises offerings from Pure, the Cloud Block Store is a cloud-native storage solution that implements API-consistent storage constructs in the public cloud. It is currently being beta-tested in AWS. The Cloud Block Store uses native AWS components, including S3 and EC2, to provide storage as a service while integrating with Pure’s Purity management software and its on-premises offerings. Although the Cloud Block Store is built on native AWS components, Pure’s software drives the solution and is API-consistent with the company’s on-premises products.

The Pure Service Orchestrator provides an abstraction layer between the various Pure Storage solutions and Kubernetes. A Kubernetes deployment can submit a Persistent Volume Claim for a particular service and performance level and let the PSO handle the allocation of storage on the corresponding storage type and tier. Storage administrators can continue to use the Purity interface to manage their Pure Storage solutions, while developers can create consistent deployments whether the target is on-premises or in the public cloud.

The Pure Service Orchestrator currently uses the FlexVolume interface to provide storage to the Kubernetes cluster. Therefore, the proper executables have to be deployed on each node in the cluster. To simplify the deployment process, Pure has used the Operator SDK to create the PSO Operator deployment. The installation creates a Custom Resource Definition based on a Helm chart and creates the necessary roles and context for the PSO to function. More information can be found on the GitHub repository. Since the FlexVolume interface is being deprecated in favor of the Container Storage Interface (CSI), Pure is actively working on updating the integration to use CSI.

Since the PSO presents storage from Pure as a Storage Class with dynamically provisioned Persistent Volumes, the source of the storage is abstracted away from Kubernetes. This means that the PSO could present storage from multiple storage devices, whether on-premises or in the cloud. The developers writing applications for Kubernetes do not have to worry about allocating capacity or selecting a storage array; they merely select the proper Storage Class for their application and let the PSO worry about allocating storage properly on the backend.
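The idea of selecting only a Storage Class and leaving backend placement to an orchestrator can be sketched in a few lines. Everything in this example is hypothetical: the class names, the backend records, and the "most free capacity wins" rule are stand-ins chosen for illustration and are not a description of how the PSO actually places volumes.

```python
# Sketch: a toy orchestrator that maps a storage class to a backend.
# Class names, backends, and the placement rule are all assumptions.
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    kind: str       # "block" or "file"
    location: str   # e.g. "on-premises" or "cloud"
    free_gib: int


# Hypothetical mapping from storage class name to backend type.
CLASS_TO_KIND = {"example-block": "block", "example-file": "file"}


def allocate(storage_class, size_gib, backends):
    """Pick the matching backend with the most free space and reserve capacity."""
    kind = CLASS_TO_KIND[storage_class]
    candidates = [
        b for b in backends if b.kind == kind and b.free_gib >= size_gib
    ]
    if not candidates:
        raise RuntimeError("no backend can satisfy the claim")
    chosen = max(candidates, key=lambda b: b.free_gib)
    chosen.free_gib -= size_gib
    return chosen.name


backends = [
    Backend("array-1", "block", "on-premises", 500),
    Backend("array-2", "block", "cloud", 800),
    Backend("filer-1", "file", "on-premises", 1000),
]

first = allocate("example-block", 100, backends)
second = allocate("example-file", 50, backends)
```

The caller supplies only a class name and a size. Which array serves the request, and whether it sits on-premises or in the cloud, never appears in the developer's request.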

Using the PSO along with Pure Storage solutions helps answer the question of storage consistency in a multi-cloud deployment. Developers can reference the PSO as a target for persistent storage regardless of where the application will be deployed. Operators can manage the backend storage with a consistent toolset across data centers and in the public cloud. Since the solution uses Pure Storage in each environment, performance should also be consistent and predictable. Although the Cloud Block Store is only available in beta on AWS today, it seems likely that Pure will expand the offering to the other major public clouds. The emergence of Kubernetes as a leader for consistent multi-cloud deployments has created a need for a consistent storage backend, and Pure Storage is stepping up to the challenge.

About the author

Ned Bellavance

Ned Bellavance is an IT professional with over 15 years of experience in the industry. He currently works as the Director of Cloud Solutions for a premier IT consulting company in the Philadelphia metropolitan area, specializing in Enterprise Architecture both on-premises and in the cloud. Ned holds a number of industry certifications from Microsoft, VMware, Citrix, and Cisco. He also has a B.S. in Computer Science and an MBA with an Information Technology concentration.
