Artificial intelligence (AI) and machine learning (ML) have played a role in many efforts that benefit society, such as the lightning-quick development of a COVID-19 vaccine. However, managing and storing petabytes of data creates plenty of operational challenges. At Storage Field Day in January, NetApp highlighted some of the platforms and tools in its portfolio that are built to add value to AI/ML workloads and to solve some of the problems organizations face in AI/ML adoption.
The Life-Changing Magic of AI/ML Adoption
Prior to the race for a COVID-19 vaccine, development and regulatory approval of a vaccine typically took years, and sometimes decades. However, in less than a year, millions of people worldwide have been vaccinated or will soon be vaccinated with one of the available COVID-19 vaccines. AI/ML adoption enabled pharmaceutical companies like Moderna to sift through massive numbers of data points to glean critical insights, make predictions, and spot patterns.
Organizations outside of health sciences also turn to AI/ML applications to drive improved business outcomes and gain a competitive edge. Regardless of the vertical, organizations face some operational challenges when it comes to AI/ML implementation. Ingesting and labeling information often means managing and storing terabytes or petabytes of data. Often, this data is spread across multiple platforms on-premises and in public clouds. While these problems are not unique to AI/ML workloads, the scale is often far larger.
Bringing Performant Storage to AI/ML Workflows
At Storage Field Day, NetApp spotlighted some of the platforms and tools in its portfolio built around adding value to AI/ML workflows. The basis of NetApp’s portfolio starts with performant storage. NetApp covered a few of its platforms suited for these types of workloads.
First up is StorageGRID. StorageGRID is NetApp’s software-defined data management solution for unstructured data. It serves up object storage that can be tiered into a public cloud. Typically, object storage is not known for its performance or speed, but StorageGRID has all-flash options that challenge the notion of slow object storage.
At a multi-petabyte scale, NetApp StorageGRID becomes an appealing option. NetApp claims that it is much more cost-effective at that scale and that, if it runs on AFF, it is more performant. Egress from the public cloud can quickly become costly, and AI/ML workloads that run in a public cloud often get hit with these charges. Additionally, NetApp has other solutions like ONTAP AI. ONTAP AI is a converged solution and verified architecture that couples GPU-equipped NVIDIA DGX servers with NetApp AFF storage to boost AI/ML and deep learning (DL) workloads.
Delivering Data Management and Copy Management Tools
Organizations that run AI/ML workloads often spend copious amounts of time copying data sets and moving data around. NetApp highlighted a few of its tools that are helpful in the ML/DL space. XCP is a client-side tool for performant copying of data lakes or moving data in and out of a data lake. Think of it as rsync or robocopy, but multi-threaded and more robust. Cloud Sync is a service that replicates data between NFS, SMB, and S3 sources and destinations, in a hybrid setup that spans NetApp systems on-premises and the public cloud. For other use cases, SnapMirror and FlexClone are also options for quickly moving or copying data.
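To make the "rsync, but multi-threaded" idea concrete, here is a minimal Python sketch of a parallel directory copy. It is not XCP and does not reflect how XCP is implemented; the function name `copy_tree_parallel` and its `workers` parameter are made up for illustration, and the sketch uses only the Python standard library.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_tree_parallel(src, dst, workers=8):
    """Copy every file under src into dst using a pool of threads.

    Hypothetical helper for illustration only (not part of any NetApp tool).
    Returns the number of files copied. Empty directories are not recreated.
    """
    src, dst = Path(src), Path(dst)
    # Build the work list up front so each thread handles whole files.
    files = [p for p in src.rglob("*") if p.is_file()]

    def copy_one(path):
        target = dst / path.relative_to(src)
        # exist_ok=True keeps this safe when two threads need the same folder.
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)  # copies file contents and metadata

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator so a failure in any worker is raised here.
        list(pool.map(copy_one, files))
    return len(files)
```

Calling `copy_tree_parallel("/data/gold", "/data/experiment-01")` (both paths are placeholders) would copy the tree with eight threads. Threads help most when the copy is bound by storage or network latency rather than CPU, which is the usual case for large data sets on shared storage.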
Open-source and available on GitHub, NetApp’s Data Science Toolkit gives data scientists programmatic access to ONTAP for functions like creating volumes, cloning, and creating snapshots for traceability. The developer-friendly toolkit is Python-based, and its functions can be called via the ONTAP API or via an importable Python library that can be used within existing ML tools like Jupyter Notebooks.
Many challenges that organizations face when running AI/ML or HPC workloads come down to having access to performant storage and moving the data to where it is needed. Tasks frequently include copying gold source datasets for experiments or ingesting data into the data pipeline on a timely basis. NetApp might not be the first vendor you think of for storage and data management of AI/ML workloads. However, NetApp has built a broad portfolio of platforms and software solutions that can add value to these next-gen workflows. To learn more about how NetApp solutions can add value to AI/ML workloads, tune into NetApp’s presentation from Storage Field Day.