It is time to stop waste in high-performance servers and storage; we shouldn’t buy and use more just to get less. The idea that everything is solved in software is based on low-cost x86 servers and storage. It doesn’t encourage efficient use of hardware so much as platform flexibility. Is it time to recognize that compute resources are neither infinite nor free? The use of GPUs, FPGAs, and ARM CPUs suggests that specialized hardware designed for specialized tasks is not a dead idea. On the other hand, treating everything like a commodity means we don’t use the unique features of the hardware, and we may not get the most value for money.
SSDs Aren’t Just Faster Hard Disks
When you put an SSD into a server, the operating system mostly treats it the same as a hard disk, yet SSDs are very different. On your laptop, that isn’t a big problem: the SSD doesn’t get worked too hard, and you will only see performance degradation when the SSD gets full. On a server, particularly an application or database server, things are different because the I/O rate is much higher. SSDs work with larger blocks of data than hard disks, so to support hard disk-like small writes they use a read-modify-write behavior. A slower background process, garbage collection, reclaims and consolidates freed pages to make fresh blocks that can be written. When the SSD receives many small writes, it may use up all of its free blocks and must wait for garbage collection before it can do the next read-modify-write cycle. To avoid running out of free blocks, server SSDs overprovision capacity, which means more SSD capacity that you pay for but cannot use to store data. The other risk on servers is that SSDs wear as they are written; each read-modify-write takes a little more life off the SSD. Many small writes, particularly those that modify existing data, will wear the SSD rapidly, leading to early failure.
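To put rough numbers on the read-modify-write cost and on overprovisioning, here is a minimal sketch. It is a deliberately simplified worst-case model, not how any particular SSD controller works: it assumes every block touched by a write is rewritten whole, that writes start on a block boundary, and that overprovisioning is a fixed percentage of raw capacity. The function names, the 1 MiB block size, and the 28% reserve are illustrative assumptions.

```python
def worst_case_write_amplification(host_write_bytes: int, block_bytes: int) -> float:
    """Flash bytes written per host byte if every touched block is rewritten whole."""
    if host_write_bytes <= 0 or block_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Assumes the write starts on a block boundary; round up to whole blocks.
    blocks_touched = -(-host_write_bytes // block_bytes)
    return blocks_touched * block_bytes / host_write_bytes


def usable_capacity_gb(raw_gb: int, reserved_percent: int) -> int:
    """Capacity left for data after reserving a share of raw flash as spare blocks."""
    if not 0 <= reserved_percent < 100:
        raise ValueError("reserved_percent must be in [0, 100)")
    return raw_gb * (100 - reserved_percent) // 100


if __name__ == "__main__":
    block = 1024 * 1024  # assumed 1 MiB block, for illustration only
    print(worst_case_write_amplification(4096, block))   # a 4 KiB write
    print(worst_case_write_amplification(block, block))  # a full-block write
    print(usable_capacity_gb(1000, 28))                  # assumed 28% reserve
```

Under these assumptions a 4 KiB write costs 256 times its own size in flash writes, while a full-block write costs exactly what the host sent. Real drives do better than this worst case, but the direction of the effect is the same.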
Protecting From Drive Failure at High Speed
The usual way to mitigate disk failure is with RAID, and SSDs are often put in RAID10 configurations. RAID10 means buying two SSDs for every SSD’s worth of data you want to store. You could consume less SSD capacity by using RAID5, but then every small write to the array means reading the old data and parity, then writing both the new data and the new parity to SSD. Did I mention wear? With two SSD writes for every write to the array, RAID5 doubles the wear rate. It would be much better for the SSDs if we wrote entire blocks at once and never modified those blocks after they were written. Deduplication is fantastic for SSDs because we store only unique blocks. Alas, your operating system and database do not store deduplicated data. It is up to your storage controller to deduplicate and optimize data.
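The capacity and wear trade-off between the two layouts can be worked through with a little arithmetic. The sketch below assumes small writes spread evenly across all drives, two device writes per array write in both layouts (two mirror copies for RAID10, data plus parity for RAID5), and a single parity drive’s worth of overhead for RAID5. The function names and the example sizes are hypothetical.

```python
from fractions import Fraction


def drives_needed(level: str, data_drives: int) -> int:
    """Physical drives required to hold `data_drives` worth of user data."""
    if data_drives < 1:
        raise ValueError("data_drives must be at least 1")
    if level == "raid10":
        return 2 * data_drives      # every drive is mirrored
    if level == "raid5":
        if data_drives < 2:
            raise ValueError("RAID5 needs at least 3 drives in total")
        return data_drives + 1      # one drive's worth of parity
    raise ValueError(f"unsupported level: {level}")


def writes_per_drive(level: str, data_drives: int, array_writes: int) -> Fraction:
    """Average device writes landing on each drive for small array writes."""
    device_writes = 2 * array_writes  # mirror pair, or data plus parity
    return Fraction(device_writes, drives_needed(level, data_drives))


if __name__ == "__main__":
    for level in ("raid10", "raid5"):
        print(level, drives_needed(level, 4), writes_per_drive(level, 4, 1000))
```

For four drives’ worth of data and 1,000 small writes, this model gives 8 drives at 250 writes each for RAID10 and 5 drives at 400 writes each for RAID5. RAID5 saves three drives, but each remaining drive absorbs more writes, and the gap approaches a factor of two as the array grows.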
Host CPU
There is a silent consumer of your server’s CPU, and it is your storage. Even with direct-attached storage, the CPU can spend significant time waiting for storage tasks like compression to complete, using up precious CPU resources. Sometimes this waiting time is reported the same as idle CPU time, but the CPU cannot do other work while it waits on IO. For example, a server that is 70% idle is performing far better than one that is at 70% IO wait. If you could offload those storage tasks, then your CPU could do more work for your application. A storage processor is an offload, or an accelerator if you prefer that name, to free your CPU from excessive IO wait.
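One way to tell true idle time from IO wait is to look at the counters separately. The sketch below assumes a Linux server, where the aggregate `cpu` line of `/proc/stat` lists cumulative ticks in the order user, nice, system, idle, iowait, irq, softirq, steal. It compares two samples of that line; the sample values are made up for illustration, and on a real machine you would read the line from `/proc/stat` twice, a few seconds apart.

```python
def parse_cpu_line(line: str) -> list[int]:
    """Parse the aggregate 'cpu' line of /proc/stat into integer tick counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def idle_and_iowait_percent(before: str, after: str) -> tuple[float, float]:
    """Share of CPU time spent idle and in IO wait between two samples."""
    old, new = parse_cpu_line(before), parse_cpu_line(after)
    # Only the first eight counters are summed; guest time is already
    # counted inside user time, so adding it would double count.
    deltas = [n - o for o, n in zip(old[:8], new[:8])]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    idle, iowait = deltas[3], deltas[4]
    return 100.0 * idle / total, 100.0 * iowait / total


if __name__ == "__main__":
    # Made-up samples: user nice system idle iowait irq softirq steal
    first = "cpu 1000 0 500 8000 500 0 0 0"
    second = "cpu 1150 0 550 8100 1200 0 0 0"
    idle_pct, iowait_pct = idle_and_iowait_percent(first, second)
    print(f"idle {idle_pct:.1f}%  iowait {iowait_pct:.1f}%")
```

With these sample values the server shows 10% idle and 70% IO wait, which is the unhealthy case described above, even though a tool that lumps the two together would report 80% of the CPU as not busy.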
Accelerate and Optimize
The Pliops Storage Processor (PSP) is exactly this sort of offload. The PSP receives the IO request from your CPU and acknowledges it as soon as the request is in NVRAM, freeing your CPU. The PSP then assembles several requests together before sending them to the SSDs. Rather than sitting at 70% IO wait, your CPU might be at only 10% IO wait and able to do more application work. The PSP also compresses raw data and can significantly reduce even already-compressed data to get the smallest data footprint possible. The PSP makes sure to write entire SSD blocks, meaning less need to overprovision and longer SSD life. With this careful handling, Pliops can extract the most performance and reliability from your SSDs, even allowing you to use lower-cost QLC SSDs that would otherwise wear too fast and be too slow.
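The general idea of acknowledging small writes early and sending only full blocks to the SSD can be sketched in a few lines. This is a toy illustration of write coalescing, not Pliops’s implementation: the class name, the `flush` callback, and the 16-byte block size are hypothetical, and the buffer here is ordinary memory, whereas a real device would hold pending data in NVRAM so that an acknowledged write survives power loss.

```python
from typing import Callable


class WriteCoalescer:
    """Buffers small writes and hands only whole blocks to the flush callback."""

    def __init__(self, block_bytes: int, flush: Callable[[bytes], None]) -> None:
        if block_bytes <= 0:
            raise ValueError("block_bytes must be positive")
        self.block_bytes = block_bytes
        self._flush = flush
        self._buffer = bytearray()

    @property
    def pending_bytes(self) -> int:
        """Bytes accepted from the host but not yet written as a full block."""
        return len(self._buffer)

    def write(self, data: bytes) -> None:
        """Accept a write and return at once; the caller does not wait on the SSD."""
        self._buffer += data
        # Emit as many complete blocks as the buffer now holds.
        while len(self._buffer) >= self.block_bytes:
            block = bytes(self._buffer[: self.block_bytes])
            del self._buffer[: self.block_bytes]
            self._flush(block)


if __name__ == "__main__":
    written_blocks: list[bytes] = []
    coalescer = WriteCoalescer(block_bytes=16, flush=written_blocks.append)
    for _ in range(5):
        coalescer.write(b"abcd")  # five small 4-byte writes
    print(len(written_blocks), coalescer.pending_bytes)
```

Five 4-byte writes produce a single 16-byte block write, with 4 bytes still waiting for more data. The SSD sees one full-block write instead of five read-modify-write cycles.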
You can learn more about Pliops by visiting their website and checking them out at the upcoming Cloud Field Day 11!