Hammerspace unveiled a new storage architecture called Hyperscale NAS that addresses the needs of AI and GPU computing. This episode of the On-Premise IT podcast, sponsored by Hammerspace, is focused on the extreme requirements of high-performance multi-node computing. Eric Bassier of Hammerspace joins Chris Grundemann, Frederic Van Haren, and Stephen Foskett to consider the characteristics that define this new storage architecture. Hammerspace leverages parallel NFS and flexible file layout (FlexFiles) within the NFS protocol to deliver unprecedented scalability and performance. AI training requires scalability, performance, and low latency but also flexible and robust data management, which makes Hyperscale NAS extremely attractive. Now that the Linux kernel includes NFS v4.2, the Hammerspace Hyperscale NAS system works out of the box with standards-based clients rather than requiring a proprietary client. Hammerspace is currently deployed in massive hyperscale datacenters and is used in some of the largest AI training scenarios.
Combining Simplicity with Speed, with the New Hammerspace Hyperscale NAS Architecture
Data is the new currency of the modern economy. It has opened huge opportunities to drive trailblazing technologies like AI and machine learning deep into businesses and industries. But as storage systems fill up with volumes of unstructured data, legacy solutions are under strain. Data overabundance can easily overwhelm and disrupt these established storage solutions, leaving organizations at risk of being outperformed by their rivals.
This episode of the On-Premise IT Podcast, brought to you by Hammerspace, explores why the new data cycle requires next-generation storage systems. Eric Bassier, Sr Director of Solution Marketing for Hammerspace, talks about a new NAS architecture that can accommodate all the data that’s heading enterprises’ way, and do it at the speed required for AI training.
A Change Is in Order
“AI is forcing a reckoning in the industry that’s probably long overdue, to change how data is used and preserved,” comments Bassier.
Bassier puts storage systems into two main categories – the traditional scale-out network-attached storage (NAS), a technology already well-known and widely deployed in organizations, and the relatively new HPC parallel file systems designed exclusively for HPC environments.
“The fact that the HPC file systems have never been widely deployed in the enterprise speaks to a gap there. They don’t have the right feature set, and are too difficult to maintain,” says Bassier.
This is also telling of an uncomfortable truth about NAS systems. “The fact that HPC file systems still exist so predominantly in HPC environments is an admission that scale-out NAS architectures don’t meet their performance demands.”
What fundamentally separates HPC and AI workloads from traditional workloads is the need for speed and performance. GPU farms for AI training need to access data concurrently at high speeds.
A Disruptive Hyperscale NAS Architecture
Hammerspace has a new architecture, Hyperscale NAS, that supports the colossal data capacity and performance demands of GPU farms.
“[The architecture] largely came out of our work with one of the world’s largest hyperscalers for their large language model training environment. It is a new storage architecture that as more and more enterprises get into AI and drive forward their initiatives, this would be the best storage architecture for large language model training, generative AI training, and other forms of deep learning,” says Bassier.
The unnamed client has a thousand-node Hammerspace storage cluster deployed in its LLM training environment, where more than 30,000 GPUs are at work across 4,000 server nodes.
“The Hammerspace storage cluster is feeding those GPUs at an aggregate performance of around 100 terabits per second. It’s 80 to 90% of line rate,” he says.
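To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above and assumes, purely for illustration, that traffic is spread evenly across nodes and that 1 terabit equals 1,000 gigabits. Real traffic is unlikely to be this uniform, and since the article says "more than 30,000 GPUs", the per-GPU figure is an approximate upper bound on the average.

```python
# Figures quoted in the article. Even distribution is an assumption made
# here for illustration only; it is not a claim about the real deployment.
AGGREGATE_TBPS = 100      # aggregate throughput, terabits per second
STORAGE_NODES = 1_000     # Hammerspace storage cluster nodes
SERVER_NODES = 4_000      # GPU server nodes
GPUS = 30_000             # "more than 30,000", so treat results as approximate


def per_unit_gbps(aggregate_tbps: float, units: int) -> float:
    """Average share of the aggregate per unit, in gigabits per second."""
    return aggregate_tbps * 1000 / units


def implied_line_rate_tbps(aggregate_tbps: float, utilization: float) -> float:
    """Line rate implied by an achieved throughput and a utilization fraction."""
    return aggregate_tbps / utilization


if __name__ == "__main__":
    print(f"per storage node: {per_unit_gbps(AGGREGATE_TBPS, STORAGE_NODES):.1f} Gb/s")
    print(f"per server node:  {per_unit_gbps(AGGREGATE_TBPS, SERVER_NODES):.1f} Gb/s")
    print(f"per GPU:          {per_unit_gbps(AGGREGATE_TBPS, GPUS):.2f} Gb/s")
    low = implied_line_rate_tbps(AGGREGATE_TBPS, 0.90)
    high = implied_line_rate_tbps(AGGREGATE_TBPS, 0.80)
    print(f"implied line rate: {low:.1f} to {high:.1f} Tb/s")
```

Under these assumptions, each storage node serves on the order of 100 Gb/s and each GPU server consumes about 25 Gb/s on average.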
Performance aside, the client chose Hyperscale NAS for its standards-based design, which lets it operate on any commercial off-the-shelf storage server, be it NAS, object, or block. One major benefit is that, by sitting on top of the existing storage, Hyperscale NAS can accelerate the underlying system without a costly upgrade.
“The underpinnings of this architecture have been in Hammerspace since day one.” Bassier points to the origin of the name “Hammerspace” to underline this. A hammerspace, he explains, is an extradimensional space invisible to the eye. Characters in movies and cartoons often use it to store unusually large objects, which they summon in times of need, making it look like they are conjured out of thin air. Think of Hermione Granger’s beaded handbag in Harry Potter, or Mary Poppins’ carpet bag.
Chris Grundemann comments, “Hyperscale NAS appears at first blush to be a representation of that. There’s no proprietary client software needed. It just works as a NAS but in a really new way, to support these crazy GPU workloads in AI.”
So, why did Hammerspace wait so long to introduce it? “We are bringing it to market now because of everything we’ve learned, where we’ve now proven this architecture at hyperscale,” says Bassier.
The paradigm is evolving fast. HPC and AI/ML workloads are going to be pervasive across organizations, and they will need a new NAS architecture that combines the performance of HPC file systems with the right enterprise feature set and the standards-based simplicity of Network File System (NFS).
Tying Together the Best of Both Solutions
In a scale-out NAS architecture, data has to make multiple network hops between the client and server. The more hops there are, the higher the transmission latency. The Hyperscale NAS architecture opens a direct data path between the two points, reducing the number of transmissions and retransmissions. The result is lower latency and higher throughput.
Metadata is handled out-of-band. “We offload a lot of the metadata operations to a separate path so we can streamline it.”
Hyperscale NAS detaches data from metadata, putting them into two separate planes – the data plane and the control plane. The metadata resides inside the metadata service nodes which are essentially queryable databases.
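The split between the two planes can be illustrated with a toy Python model. This is only a conceptual sketch, not Hammerspace's implementation and not the NFS wire protocol: the class names, the round-robin striping, and the in-memory "servers" are all assumptions made for illustration. The point it shows is that the client asks the metadata service once where a file's pieces live, then fetches the data directly from the data servers.

```python
from dataclasses import dataclass, field


@dataclass
class DataServer:
    """Hypothetical data-plane node holding file stripes in memory."""
    name: str
    blocks: dict = field(default_factory=dict)  # (path, stripe index) -> bytes

    def read(self, path: str, index: int) -> bytes:
        return self.blocks[(path, index)]


@dataclass
class MetadataService:
    """Hypothetical control-plane node: knows where stripes live, holds no data."""
    layouts: dict = field(default_factory=dict)  # path -> server name per stripe
    requests: int = 0

    def get_layout(self, path: str) -> list:
        self.requests += 1
        return list(self.layouts[path])


def write_file(mds, servers, path, data, stripe_size):
    """Stripe data round-robin across servers and record the layout."""
    names = sorted(servers)
    layout = []
    for i, offset in enumerate(range(0, len(data), stripe_size)):
        name = names[i % len(names)]
        servers[name].blocks[(path, i)] = data[offset:offset + stripe_size]
        layout.append(name)
    mds.layouts[path] = layout


def read_file(mds, servers, path):
    """One metadata round trip, then direct reads from the data servers."""
    layout = mds.get_layout(path)
    return b"".join(servers[name].read(path, i) for i, name in enumerate(layout))


if __name__ == "__main__":
    mds = MetadataService()
    servers = {n: DataServer(n) for n in ("ds1", "ds2", "ds3")}
    write_file(mds, servers, "/train/shard0", b"hyperscale-nas-demo", 4)
    print(read_file(mds, servers, "/train/shard0"))
    print("metadata requests:", mds.requests)
```

In this sketch the metadata service never touches file contents, so data reads can fan out across servers while metadata traffic stays on its own path.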
This ties into another key aspect of the Hyperscale NAS architecture that Bassier highlights. Oftentimes, file systems are trapped in the storage layer, which makes data opaque to users. This is a barrier to collaborative work.
Hammerspace lifts the file system out of the storage layer and creates a global parallel file system with a single global namespace. Datasets are assimilated from multiple sources across sites and storage silos, and deposited into this file system. With global data orchestration, transparency is ensured for all users.
“Even users that are remote or not co-located with the data are all presented the same files that they’re authorized to see.”
Hyperscale NAS leverages the NFS v4.2 client, particularly two of its optional capabilities – parallel NFS and FlexFiles. “Hammerspace is the first one to take advantage of those capabilities,” says Bassier.
If Hyperscale NAS sounds a lot like an HPC parallel file system to you, it is worth noting that there are significant differences. Where other solutions rely on proprietary file system clients or agents that sit on GPU servers to give them the intelligence, Hammerspace doesn’t, and works with all standards-based clients, he concludes.
To learn more, visit Hammerspace’s website. Also check out Hammerspace’s presentation from the recent AI Field Day event for a deep dive into the architecture. Other interesting content to check out includes Alastair Cooke’s article and Keith Townsend’s writeup on Hyperscale NAS.
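Because the client is the standard Linux NFS client, one simple way to see which NFS version a mount negotiated is to read the `vers=` option from `/proc/mounts`. The Python sketch below parses that file's format; the sample line, including the server name `mds.example.com` and mount point `/mnt/hs`, is hypothetical. Note that `vers=4.2` only shows the protocol version in use; it does not by itself confirm that parallel NFS or FlexFiles layouts are active.

```python
from pathlib import Path

# Hypothetical sample in /proc/mounts format:
# device  mountpoint  fstype  options  dump  pass
SAMPLE = (
    "mds.example.com:/share /mnt/hs nfs4 "
    "rw,relatime,vers=4.2,rsize=1048576,proto=tcp 0 0\n"
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
)


def nfs_versions(mounts_text: str) -> dict:
    """Map each NFS mount point to the value of its vers= mount option."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not an NFS mount.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        # Turn "a=b,c" into {"a": "b", "c": ""}.
        options = dict(opt.partition("=")[::2] for opt in fields[3].split(","))
        result[fields[1]] = options.get("vers", "unknown")
    return result


if __name__ == "__main__":
    mounts = Path("/proc/mounts")
    # Fall back to the sample on systems without /proc/mounts.
    text = mounts.read_text() if mounts.exists() else SAMPLE
    for mount_point, version in nfs_versions(text).items():
        print(f"{mount_point}: NFS v{version}")
```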
Podcast Information:
- Stephen Foskett is the Publisher of Gestalt IT and Organizer of Tech Field Day, now part of The Futurum Group. Find Stephen’s writing at Gestalt IT and connect with him on LinkedIn or on X/Twitter.
- Frederic Van Haren is the CTO and Founder at HighFens Inc., Consultancy & Services. Connect with Frederic on LinkedIn or on X/Twitter and check out the HighFens website.
- Chris Grundemann is the Managing Director at Grundemann Technology Solutions. You can connect with Chris on LinkedIn and on X/Twitter or visit his website to learn more.
- Eric Bassier is the Senior Director of Solution Marketing at Hammerspace. You can connect with Eric on LinkedIn and learn more about Hammerspace on their website and by watching their presentations at AI Field Day.
Gestalt IT and Tech Field Day are now part of The Futurum Group.