There’s a surreal quality to containers. They appear and then vanish, summoned into existence by an invisible orchestrator, only to melt away moments later, like the clocks in Dali’s famous painting.
A container’s lack of permanence can be disorienting, particularly to storage administrators whose professional calling is to capture and preserve data.
Yet the impermanence of containers is by design. Containers run individual processes, and if a process isn’t needed at a particular moment, neither is its container.
This design conserves resources such as memory and CPU. It’s ideal for testing and development environments where functions can be spun up and then shut down as needed.
It’s also tailor-made for applications in the public cloud, where the provider charges customers based on consumption. By dismissing containers when not in use, organizations can more closely manage their cloud usage and help control costs.
At the same time, orchestration and auto-scaling mean that containers can be added back as necessary. Rather than pay more to overprovision an application to account for intermittent spikes, you can simply instantiate more containers when they’re needed, and then whisk them away after peak usage subsides.
Persistent Volumes to the Rescue
But what about the storage admins, whose jobs depend on capturing all of the bits? And what about stateful applications such as databases that require persistent storage?
The lack of persistent storage was an issue in the early days of containers, but it has been largely solved today. In particular, Kubernetes, the container orchestration platform, has addressed this problem with three key constructs: the Storage Class (SC), Persistent Volume (PV), and Persistent Volume Claim (PVC).
By enabling persistent storage, Kubernetes environments can support stateful applications, as well as ensure that data is protected in the event of an unintended container failure or restart.
As with other resources such as memory and CPU, Kubernetes abstracts away the underlying physical storage. Application developers don’t need specific knowledge of the array. Instead, storage administrators ensure that a sufficient quantity of storage, with the requisite performance and protection characteristics, is available to the Kubernetes cluster. Developers simply declare what they need, and Kubernetes works with the storage vendor’s provisioning software to provision the storage.
Let’s review the framework that enables persistent storage:
Storage Class (SC)
Within a Kubernetes cluster, the Storage Class object is an abstraction of the underlying storage. A cluster can have several storage classes available, for example one for block storage and another for file storage. Other attributes, such as performance level or intended use (e.g., backups), can also be defined.
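As a rough illustration, here is a minimal Python sketch that builds two Storage Class manifests as plain dictionaries. The field names follow the Kubernetes `storage.k8s.io/v1` StorageClass object, but the class names, the `example.com/...` provisioner names, and the `tier` parameter are hypothetical placeholders. Real values depend on the storage driver installed in the cluster.

```python
import json


def storage_class(name, provisioner, parameters, reclaim_policy="Delete"):
    """Build a StorageClass manifest as a plain dict."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        # Hypothetical provisioner name; a real one comes from the storage driver.
        "provisioner": provisioner,
        # Opaque key/value settings interpreted only by the provisioner.
        "parameters": dict(parameters),
        # What happens to a dynamically provisioned volume once its claim is deleted.
        "reclaimPolicy": reclaim_policy,
    }


# Two classes with different characteristics (all names are assumptions).
fast_block = storage_class(
    "fast-block", "example.com/block-provisioner", {"tier": "performance"}
)
backup_file = storage_class(
    "backup-file",
    "example.com/file-provisioner",
    {"tier": "capacity"},
    reclaim_policy="Retain",
)

print(json.dumps(fast_block, indent=2))
```

Because `kubectl` accepts JSON as well as YAML, output like this could in principle be saved to a file and applied to a cluster, once the placeholders are replaced with values the cluster actually supports.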
Persistent Volume (PV)
A Persistent Volume is a piece of storage available to a cluster whose lifecycle is independent of any container that uses it. That is, even if the container attached to that storage goes away, the data will remain. PVs can be provisioned statically by an administrator or dynamically, on demand, through a Storage Class.
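To make the administrator-provisioned case concrete, the following sketch builds a Persistent Volume manifest in Python. The field names follow the Kubernetes `v1` PersistentVolume object. The volume name, the `manual` class name, and the `/mnt/data` path are assumptions chosen for illustration, and `hostPath` is used only because it needs no external array; it is suitable for single-node testing, not production.

```python
import json


def persistent_volume(name, capacity, storage_class_name, host_path):
    """Build a statically provisioned PersistentVolume manifest as a dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": capacity},
            # ReadWriteOnce: mountable read-write by a single node.
            "accessModes": ["ReadWriteOnce"],
            # Retain keeps the volume and its data after the claim is deleted.
            "persistentVolumeReclaimPolicy": "Retain",
            "storageClassName": storage_class_name,
            # hostPath is a stand-in backend for local experiments only.
            "hostPath": {"path": host_path},
        },
    }


# Hypothetical example values.
pv = persistent_volume("pv-demo", "10Gi", "manual", "/mnt/data")
print(json.dumps(pv, indent=2))
```

With dynamic provisioning, nobody writes this object by hand. The provisioner named in the Storage Class creates an equivalent volume when a claim asks for one.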
Persistent Volume Claim (PVC)
The Storage Class and Persistent Volume settings designate that a pool of persistent storage is available to a Kubernetes cluster. The PVC reserves a specific portion of that resource pool.
Like a restaurant reservation that sets aside a table with specific characteristics (number of seats and time of reservation), the PVC sets aside a portion of the available storage resources for a specific workload. The PVC also requests specific characteristics, such as capacity and access mode.
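The reservation idea can be sketched as a toy model in Python. This is a deliberately simplified illustration, not the actual Kubernetes controller logic, which also considers selectors, volume modes, node affinity, and dynamic provisioning. The `Volume` and `Claim` classes, the `bind` function, and all names and sizes are hypothetical. Only the `to_manifest` output follows the real `v1` PersistentVolumeClaim fields.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Volume:
    name: str
    capacity_gib: int
    access_modes: frozenset
    storage_class: str
    claimed_by: Optional[str] = None  # a volume serves at most one claim


@dataclass
class Claim:
    name: str
    request_gib: int
    access_modes: frozenset
    storage_class: str

    def to_manifest(self) -> dict:
        """Render the claim as a PersistentVolumeClaim manifest."""
        return {
            "apiVersion": "v1",
            "kind": "PersistentVolumeClaim",
            "metadata": {"name": self.name},
            "spec": {
                "accessModes": sorted(self.access_modes),
                "storageClassName": self.storage_class,
                "resources": {"requests": {"storage": f"{self.request_gib}Gi"}},
            },
        }


def bind(claim: Claim, volumes: List[Volume]) -> Optional[Volume]:
    """Reserve the smallest free volume that satisfies the claim, if any."""
    candidates = [
        v for v in volumes
        if v.claimed_by is None
        and v.storage_class == claim.storage_class
        and v.capacity_gib >= claim.request_gib
        and claim.access_modes <= v.access_modes
    ]
    if not candidates:
        return None  # the claim would stay pending
    best = min(candidates, key=lambda v: v.capacity_gib)
    best.claimed_by = claim.name
    return best


pool = [
    Volume("pv-small", 5, frozenset({"ReadWriteOnce"}), "fast-block"),
    Volume("pv-large", 50, frozenset({"ReadWriteOnce", "ReadOnlyMany"}), "fast-block"),
    Volume("pv-archive", 100, frozenset({"ReadWriteOnce"}), "backup-file"),
]
db_claim = Claim("db-data", 20, frozenset({"ReadWriteOnce"}), "fast-block")
bound = bind(db_claim, pool)
print(bound.name if bound else "pending")
```

In this model the 20 GiB claim lands on `pv-large`, because `pv-small` is too small and `pv-archive` belongs to a different class. A second identical claim would find no free match and remain pending, much like a diner arriving after the last suitable table has been reserved.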
Orchestration All the Way Down
Kubernetes must interact with software provided by the storage vendor to provision storage with the characteristics the application requires.
For example, Pure Storage has developed the Pure Service Orchestrator, software that integrates with Kubernetes so that its arrays can correctly provision storage resources for container-based applications. It also lets both storage administrators and developers use automation tools and frameworks to streamline provisioning and consumption.
Other vendors also support persistent storage for containers, including Red Hat with OpenShift, as do public cloud providers such as AWS and Azure. In other words, whether your developers are writing container-based applications for the data center or the cloud, they can be assured the right storage for the application will be available.