It’s no secret that the amount of data generated, whether personal, proprietary, or public, is increasing at an exponential rate. That growth means more, and larger, volumes, virtual machines, and other managed objects. Because storage system capacities are increasing at the same time, a single system can accommodate many more of these larger objects. These developments support explosive data growth, but they also raise the likelihood of resource contention and noisy neighbors that can hurt performance and productivity.
In this increasingly complex environment, the traditional process of managing these objects with human labor has made its limits painfully clear. There is a limit to the amount of data and number of resources an admin can keep track of or optimize before they suffer from information overload and start making suboptimal decisions.
Automation vs. Artificial Intelligence
A better approach to managing this complex environment is to involve the storage system itself in management activities. An artificial intelligence algorithm is awake 24/7 and can keep track of many more object relationships than a human being can. It can also use historical data to look back further and with higher accuracy. By acting on this more complete set of data, AI can take the right action to ensure your virtual machines stay online and perform optimally.
One example is a storage system that automates a chain of repetitive tasks so that you can focus on the result you want, for instance a well-performing database, rather than on the individual steps. Creating a volume, ensuring it has adequate performance without impacting other volumes, selecting the system with the largest amount of free space, and keeping settings consistent across similar databases are all tasks better left to automation.
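To make that chain of tasks concrete, here is a minimal Python sketch of what such automation might look like. It is purely illustrative: the `StorageSystem` class, the `DATABASE_TEMPLATE` settings, and the `provision_database_volume` function are hypothetical names invented for this example, not the API of any real storage product.

```python
from dataclasses import dataclass, field


@dataclass
class StorageSystem:
    """Hypothetical model of one storage system in a pool."""
    name: str
    free_gb: int
    volumes: dict = field(default_factory=dict)


# Hypothetical shared settings, so similar databases are configured alike.
DATABASE_TEMPLATE = {"min_iops": 1000, "max_iops": 5000, "snapshots": "hourly"}


def provision_database_volume(systems, volume_name, size_gb):
    """Create a volume on the system with the most free space."""
    candidates = [s for s in systems if s.free_gb >= size_gb]
    if not candidates:
        raise ValueError("no system has enough free space")
    # Pick the system with the largest amount of free space.
    target = max(candidates, key=lambda s: s.free_gb)
    # Apply the same performance and protection settings to every database volume.
    target.volumes[volume_name] = {"size_gb": size_gb, **DATABASE_TEMPLATE}
    target.free_gb -= size_gb
    return target


systems = [StorageSystem("array-a", 500), StorageSystem("array-b", 2000)]
chosen = provision_database_volume(systems, "sales-db", 300)
print(chosen.name, chosen.free_gb)  # array-b 1700
```

The admin asks only for the outcome (a database volume of a given size). Placement and settings follow from rules that are applied the same way every time.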
While such automation is useful, does this truly represent artificial intelligence, or is it just a scripted set of tasks? With AI comes the expectation of autonomous operations. If a system, for example, notices any resource contention, it should accurately analyze the problem’s source and come up with the best solution. If that solution requires moving some objects to a different location or throttling down an anomaly, it may – indeed it should – perform these tasks autonomously.
AI needs a lot of data
In many cases, such as the example above, your AI algorithm will need lots of data to make an accurate decision. While this does not necessarily mean ‘big data’ in the popular sense of the term, the analyzed data needs to adhere to some essential standards.
First, the data should span a long enough window, because longer windows help the algorithm spot trends. Take a scenario where a database is loaded periodically to refresh a test environment. If your data covers only a short window, you could mistake that recurring peak for an anomaly and wrongly throttle it. Conversely, you could overlook an actual anomaly and do nothing.
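The effect of window length can be shown with a small Python sketch. It assumes hourly load samples and a weekly refresh job. The `classify_latest` function, its thresholds, and the sample numbers are all made up for illustration and are far simpler than what a real system would use.

```python
from statistics import median

HOURS_PER_WEEK = 168


def classify_latest(history, period, spike_factor=2.0, tolerance=0.25):
    """Label the newest sample as 'normal', 'recurring', or 'anomaly'.

    history: load samples, oldest first. period: suspected cycle length in samples.
    """
    latest = history[-1]
    if latest <= spike_factor * median(history):
        return "normal"
    # Look at the same point in every earlier cycle that the window still covers.
    previous_cycles = [
        history[i] for i in range(len(history) - 1 - period, -1, -period)
    ]
    if previous_cycles and all(
        abs(value - latest) <= tolerance * latest for value in previous_cycles
    ):
        return "recurring"
    return "anomaly"


# Made-up data: a steady load of 100 with a refresh peak of 900 once a week.
one_week = [100.0] * (HOURS_PER_WEEK - 1) + [900.0]
four_weeks = one_week * 4
three_days = four_weeks[-72:]

print(classify_latest(three_days, HOURS_PER_WEEK))  # anomaly
print(classify_latest(four_weeks, HOURS_PER_WEEK))  # recurring
```

With only three days of history there is no earlier cycle to compare against, so the weekly refresh looks like an anomaly. With four weeks of history the same peak is recognized as part of a pattern.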
The data and analysis should also provide sufficient granularity or detail. The most elusive performance problems are the ones that are short-lived and infrequent. If you measure too infrequently, you could miss these peaks and fail to act. Likewise, measuring only a big blob of bytes (aka a LUN) instead of the underlying object (for example, a VM) does not help with accuracy.
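A short Python sketch shows how coarse measurement can hide a short-lived peak. The latency figures, the 20 ms alert threshold, and the `bucket_means` helper are assumptions chosen for illustration only.

```python
def bucket_means(samples, bucket_size):
    """Average consecutive samples into buckets of bucket_size samples."""
    means = []
    for start in range(0, len(samples), bucket_size):
        bucket = samples[start:start + bucket_size]
        means.append(sum(bucket) / len(bucket))
    return means


# Made-up data: ten minutes of per-second latency at 2 ms,
# with a 20-second spike to 50 ms.
latency_ms = [2.0] * 600
latency_ms[100:120] = [50.0] * 20

ALERT_MS = 20.0  # hypothetical alert threshold

per_second_peak = max(latency_ms)
five_minute_peak = max(bucket_means(latency_ms, 300))

print(per_second_peak > ALERT_MS)   # True
print(five_minute_peak > ALERT_MS)  # False
```

At one-second granularity the spike is obvious. Averaged over five-minute intervals it shrinks to roughly 5 ms and never crosses the threshold, so nothing would act on it.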
Capturing all of this data amounts to a vast number of data points, especially with tens of thousands of managed objects, because it means saving weeks or months of data at a time, in great detail. However, if done right, adding artificial intelligence to the storage system management plane will help organizations squeeze the maximum efficiency out of their storage investment. At the same time, it makes it easier for administrators to manage this ever-increasing number of storage objects.
But an AI algorithm cannot do all of this the right way without the right data set. In the next blog post, the second of this three-part series, we will look at how to supply the AI with the correct data and help the storage system make the right recommendations.
To learn more about Tintri, check out its recent Storage Field Day presentation!