
Solidigm and Why Storage Is Critical to the Success of AI

With generative AI growing 500% year over year, and companies like Supermicro shipping 5,000 integrated racks every month, the need for high-performance storage that can work with both CPUs and GPUs is on the rise.

Storage needs to run efficiently and at speed so that GenAI models can run without stopping, even for a second. The moment storage goes down, both production and money are lost.

So, let’s talk about what GenAI needs to retrieve and store data, and why it is important that all of this is organized and cataloged for the future.

Challenges in AI Storage – The Need for High IOPS

Massive datasets and complex computations from multiple users at any given time mean that systems need high Input/Output Operations Per Second (IOPS). It’s no different from when typewriter keyboard layouts were devised. There were more efficient layouts than QWERTY, such as the Dvorak keyboard, but the bottleneck was the hammer keys themselves. The typewriter would jam because keystrokes arrived too fast from different directions. As a result, the paper was left either blank or covered in disorganized letters.

Imagine if the hammer keys, which in this case are the I/O operations, quietly got stuck during an AI computation. If you are running a large AI query, you may never get the results you want. Restarting queries takes time, and outcomes suffer if the AI cannot search the database for answers.
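To make the IOPS figure more concrete, here is a minimal Python sketch of the standard relationship between IOPS, block size, and throughput (throughput = IOPS × block size). The function names and the sample numbers are hypothetical and chosen only for illustration; they do not come from the Supermicro or Solidigm presentations.

```python
def throughput_mb_per_s(iops: int, block_size_bytes: int) -> float:
    """Throughput (in MB/s, 1 MB = 1,000,000 bytes) implied by an IOPS figure."""
    return iops * block_size_bytes / 1_000_000


def required_iops(target_mb_per_s: float, block_size_bytes: int) -> float:
    """IOPS needed to sustain a target throughput at a fixed block size."""
    return target_mb_per_s * 1_000_000 / block_size_bytes


# Hypothetical example: the same 100,000 IOPS moves very different
# amounts of data depending on how large each operation is.
small_blocks = throughput_mb_per_s(100_000, 4 * 1024)    # 4 KiB operations
large_blocks = throughput_mb_per_s(100_000, 128 * 1024)  # 128 KiB operations

print(f"4 KiB blocks:   {small_blocks:.1f} MB/s")
print(f"128 KiB blocks: {large_blocks:.1f} MB/s")
```

The point of the sketch is that workloads made of many tiny reads and writes need a very high IOPS figure to move the same volume of data that a few large transfers would.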

The I/O Blender Effect

The “I/O Blender effect” refers to the complexity of Input/Output operations in storage systems, especially in environments running multiple AI workloads simultaneously. The different data access patterns of the applications in use blend together, which makes it hard for the storage system to organize requests for optimal performance. With millions of tiny reads and writes hitting NVMe drives, storage can become faster than the CPUs that are trying to read from and write to it. It’s the typewriter key problem all over again.
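The blending described above can be illustrated with a toy Python simulation. This is a simplified sketch under stated assumptions, not a model of any real storage stack: each workload is assumed to read consecutive block addresses in its own region, and the shared storage is assumed to receive the requests in round-robin order. All names and addresses are hypothetical.

```python
def sequential_stream(start: int, count: int) -> list[int]:
    """Block addresses for one workload reading `count` consecutive blocks."""
    return [start + i for i in range(count)]


def blend(streams: list[list[int]]) -> list[int]:
    """Round-robin interleave of equal-length streams, as seen by shared storage."""
    blended = []
    for group in zip(*streams):
        blended.extend(group)
    return blended


def sequential_fraction(addresses: list[int]) -> float:
    """Share of requests that directly follow the previous block address."""
    if len(addresses) < 2:
        return 1.0
    hits = sum(1 for prev, cur in zip(addresses, addresses[1:]) if cur == prev + 1)
    return hits / (len(addresses) - 1)


# Four hypothetical workloads, each perfectly sequential on its own.
streams = [sequential_stream(start, 8) for start in (0, 1000, 2000, 3000)]
blended = blend(streams)

print("per workload:", [sequential_fraction(s) for s in streams])
print("blended:     ", sequential_fraction(blended))
```

Each workload looks fully sequential in isolation, yet the interleaved request order that reaches the storage contains no sequential pairs at all, which is why blended traffic behaves like random I/O.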

The Significance of Data Management

Organizing, storing, and processing data are the three key pieces of AI model training. Proper data management matters so that when someone runs a query, the answers they get are sound and reliable.

AI can then turn around and create data models for more accurate responses. The goal is to generate actionable AI insights. These insights help produce faster and more clear-cut answers to queries.

Other Considerations

At AI Field Day in California, Supermicro’s Wendell Wenjen and Paul McLeod demonstrated the hierarchy of the Gen5 storage platforms that are built for AI-centric software-defined datacenters. They include: 

  • Single or Twin HDD for high-capacity storage but low AI functions
  • A Hybrid of SSD and HDD for equal AI and storage performance
  • CXL, TLC, and Solidigm QLC SSDs for highest performance of AI queries in GPU clusters 

CPUs and GPUs will likely come under a lot of pressure in each of these cases. But a hierarchy like the one above will provide high capacity and keep costs down. It can also be tested and repaired, if needed, while AI models continue to be created and run.

Where Generative AI Goes from Here Depends on…

Simply put, if data cannot be referenced properly, or if key data is missed because IOPS are slow, or AI insights are incomplete, it can compromise future data. Imagine putting into a model the information – “2+2=4”, and it reorganizes it to “2+2=tangerine” and never corrects it. Billions of queries will have wrong results.

As companies start to rely more on GenAI, every second that information is wrong or unavailable can lead to a catastrophic end result. That is why storage that is reliable, efficient, and fast is imperative for Generative AI. It’s critical to have optimized storage with high IOPS and effective data management. Otherwise, there will only be pages of incomplete and incoherent information in the archives.

For more, be sure to check out Supermicro and Solidigm’s presentations from the recent AI Field Day event.  

About the author

Jeffrey Powers

Jeffrey Powers covers consumer, enterprise, auto, and music technology, along with audio/video podcasting. His web brand – Geekazine.com – keeps you up to date with interviews, reviews, and much more. Geekazine LIVE! is the live stream, which can be seen on Twitch and YouTube, and covers events for many clients. Jeffrey is also one-half of Build Day LIVE! – an event covering Enterprise Day-One installations.
