RocksDB is the data engine of choice for key-value store (KVS) data among hyperscalers, from Facebook (which originated it) to Redis. But RocksDB faces performance challenges at scale and requires a great deal of tuning to get right. Launching this week is Speedb, a company with a replacement for RocksDB that offers better performance and scalability with much less hardware investment. It’s about time the knowledge of the storage industry reached the world of data!
Storage Is Hard to Get Right
The storage industry is littered with failed companies, and most of these were focused on building a better general-purpose storage array. When this works, it can be a phenomenal success, as was the case with Data General, EMC, NetApp, Pure Storage, Infinidat, and lately VAST Data, but there are many more stories of failure. The world doesn’t actually want better storage; it wants better applications, and this is difficult to deliver with a storage array.
But storage is hard to “do” right. Just as many companies have failed trying to build their own storage engine, only to realize that this seemingly simple component is hard to get right. Some of the greatest application challenges are due to inefficient storage concepts that ignore decades of lessons well known among storage-focused engineers. Even successful projects like Docker, Kubernetes, MySQL, and RocksDB have spent years spinning their wheels optimizing fundamentally wrong approaches to storage.
From LevelDB to RocksDB
Let’s take RocksDB as an example. This Facebook project is based on LevelDB from Google, an on-disk key-value store conceptually descended from Google’s seminal Bigtable database system. Although LevelDB was a useful tool for Google, it was optimized for batch and sequential access on disks rather than today’s scale-out, random-access microservices. Facebook forked LevelDB as RocksDB, with the stated goal of improving server workloads through parallel operations and features like backups, snapshots, and database-style transactions.
RocksDB is a huge improvement over LevelDB and other key-value stores and has become ubiquitous in modern web-scale applications. It is the embedded storage engine in Ceph BlueStore, Apache Flink, and Yugabyte, and an alternative backend for Cassandra, MariaDB and MySQL, and MongoDB. This makes it a key component of everything from Facebook and LinkedIn to Redis.
The success of RocksDB has become something of a liability, since its unique architecture poses a challenge to the underlying storage systems used by these applications. Much energy has gone into tuning and tweaking the storage beneath RocksDB, with relatively little to show for it. Most users today simply throw money at the problem, using NVMe SSDs with plenty of bandwidth, rather than doing the hard work of getting great performance in their environment, as Yugabyte has. And storage vendors are pulling out their hair trying to tweak their arrays to improve RocksDB performance, with little success.
Speedb’s Next-Generation Data Engine
Speedb is taking a different approach. Rather than treating RocksDB as a black box to be worked around, they are building a replacement that incorporates the institutional knowledge of the enterprise storage industry. The company claims that SpeeDB (the product) delivers ten times the IOPS, 100 times the scalability, and 80% less resource utilization than RocksDB. And their solution is endorsed by Redis Ltd., which is rapidly adopting SpeeDB across their data platform.
The Speedb team (they appear to use a lower-case “db” when referring to the company) is made up of industry veterans with notable links to Infinidat, a respected alternative to the “big iron” storage solutions from Dell EMC, HPE, and Hitachi. The company was founded just a year ago and raised seed funding last November to get started. They came out of stealth yesterday and we expect that their profile will rise significantly based on the Redis solution.
Note: This product is unrelated to a genomics data project also named SpeeDB.
Stephen’s Stance
In my position as Editor of Gestalt IT and organizer of the Tech Field Day events, I talk to lots of companies. Speedb reminds me of some of the most successful startups in that they are attacking an acute industry need (RocksDB performance) with credible technical skills. Rather than developing yet another storage array, Speedb is solving a critical challenge felt across a massive number of customers. SpeeDB (the product) seems to work, and I suspect that Speedb (the company) might too.