Data migrations are an inevitable part of IT. Systems come and go. Vendors come and go. But the data remains. Data persists whether or not it’s still in use. Depending on the type of data, there’s a really good chance the data is “cold” data – data that hasn’t been accessed in quite a while. Depending on your industry, you might even have regulations that prevent you from deleting data before a set number of years, if ever.
In parallel, data is being created at an ever-increasing rate and from more devices. These two forces, increased data retention and an increased rate of data creation, mean more data is being stored, which puts pressure on storage admins to meet mission needs related to the availability, integrity, and accessibility of data. This often leads to purchasing more storage, often from different vendors, which in turn leads to migrations.
We have a number of ways to migrate data from a previous system to a new one. Some people run rsync on Linux boxes, Robocopy on Windows, or maybe even an elaborate set of scripts developed in-house. Each method has its advantages and drawbacks.
The drawbacks are probably well known for those of us who’ve been around the block once or twice. Transmission retries could be spotty. Maybe the logging isn’t great. Probably the one that stings the most is not being able to reliably answer the question, “How long will it take?” At best, you may get an estimate, but the accuracy of that estimate will vary based on the tool used. Another question that traditional tools fail at answering is, “How much will this cost us?” Project performance is often measured by being on-time and on-budget. Without the ability to definitively answer such questions, the effectiveness of a project can’t easily be quantified. Successful IT teams produce metrics that can make it easier for the business side to understand IT value.
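To make the "How long will it take?" question concrete, here is a minimal back-of-the-envelope sketch. It is not how any particular tool produces its estimate; the function name and the `efficiency` factor (a guess at how much of the raw link speed a real copy sustains) are assumptions for illustration. Lots of small files, retries, and a busy source array can all push the real number well past this.

```python
def estimate_transfer_hours(total_bytes, link_bits_per_sec, efficiency=0.8):
    """Rough wall-clock estimate for moving total_bytes over a network link.

    efficiency is an assumed fraction of raw link speed that the copy
    actually sustains (protocol overhead, small files, contention).
    """
    if total_bytes < 0:
        raise ValueError("total_bytes must not be negative")
    if link_bits_per_sec <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed must be positive and efficiency in (0, 1]")
    effective_bytes_per_sec = link_bits_per_sec / 8 * efficiency
    return total_bytes / effective_bytes_per_sec / 3600


if __name__ == "__main__":
    # 10 TB over a 1 Gbit/s link at an assumed 80% efficiency
    hours = estimate_transfer_hours(10 * 10**12, 10**9)
    print(f"{hours:.1f} hours")  # 27.8 hours
```

Even this crude arithmetic shows why the estimate swings so much: change the assumed efficiency and the answer moves by hours or days.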
The Komprise Advantage
Komprise is a data management platform that provides data analytics, data archiving/tiering, and data replication. Komprise also provides for data migration, which is copying data from source to destination. Komprise can do this on-prem or in the cloud. The on-prem version works to move file data between all the major vendors. Traditional file protocols such as NFS and SMB are supported, as is object storage. What this means in a real-world example is you can use Komprise to move your files from a NetApp array to, say, an Isilon system. Why would you want to do that? Maybe the NetApp array is getting long in the tooth. Or maybe, because NetApp supports multiple storage types, you decide to repurpose the capacity behind the NFS mounts as block storage. There are many triggers for data movement, but when it happens, you want it to happen as fast as possible, with as little user disruption as possible, and with a high level of observability.
Komprise provides several additional enhancements to the migration process via its scalable architecture. The ability to add observers provides flexibility in terms of capacity to improve migration performance. This is somewhat unusual in the industry, as these are software additions instead of more physical boxes. Komprise refers to its parallel migration capability as elastic data migration architecture. The company has benchmarks that claim up to a 27x faster copy time than rsync or Robocopy.
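The general idea of parallel copying can be sketched in a few lines of Python. This is not Komprise's implementation and says nothing about how its observers work; it is a generic thread-pool sketch using only the standard library, and the names `copy_one` and `parallel_copy` are made up for illustration.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_one(src, src_root, dst_root):
    """Copy one file, keeping its relative path; return (path, error or None)."""
    dst = dst_root / src.relative_to(src_root)
    try:
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copies contents plus timestamps where possible
        return src, None
    except OSError as exc:
        return src, exc


def parallel_copy(src_root, dst_root, workers=8):
    """Copy every file under src_root to dst_root using a pool of threads.

    Returns (number of files copied, list of (path, error) failures).
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    files = [p for p in src_root.rglob("*") if p.is_file()]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda p: copy_one(p, src_root, dst_root), files))
    errors = [(p, e) for p, e in results if e is not None]
    return len(files) - len(errors), errors
```

A call such as `parallel_copy("/mnt/old_share", "/mnt/new_share", workers=16)` (both paths hypothetical) walks the source tree and copies files concurrently. Whether more workers actually help depends on where the bottleneck is: network, source array, or destination.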
In my evaluation, I didn’t capture benchmarks, though the idea of running parallel copies has been used by others to a similar effect – the principle is sound. Another useful aspect of the solution is that it migrates data without getting in the way. No agents are required for the scans or to complete storage-related actions. Performance remains consistent because the Komprise architecture stays out of the data path. As the figure below shows, the Komprise migration interface provides all the information needed to have a true sense of what’s happening at any point during the migration. These types of stats can often be gathered using a vendor’s native replication tool, but they’re hard to get when moving between vendors. File counts, error counts, and transfer-time histories are all empowering metrics.
Similarly useful is the ability to see how much cost is associated with the storage and the savings that come with moving data between locations or platforms. The recommendations for moving data are given in plain language and explain the amount of potential savings based upon the analytics related to the amount of data, access dates, and cost of storage collected by Komprise. Komprise allows users to provide their actual costs, which makes the data extremely valuable in that it’s actionable.
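As a rough illustration of how data volume, access dates, and storage cost combine into a savings figure, here is a small sketch. It is not Komprise's cost model; the function name, the 365-day "cold" threshold, and the per-TB monthly prices are all assumptions chosen for the example.

```python
from datetime import datetime, timedelta

TB = 10**12  # decimal terabyte, in bytes


def cold_data_savings(files, cold_after_days, primary_cost_per_tb,
                      archive_cost_per_tb, now=None):
    """Estimate monthly savings from moving cold files to cheaper storage.

    files is an iterable of (size_in_bytes, last_access_datetime) pairs.
    Returns (cold terabytes, monthly savings in the same currency as the costs).
    """
    now = now or datetime.now()
    cutoff = now - timedelta(days=cold_after_days)
    cold_bytes = sum(size for size, last_access in files if last_access < cutoff)
    cold_tb = cold_bytes / TB
    return cold_tb, cold_tb * (primary_cost_per_tb - archive_cost_per_tb)


if __name__ == "__main__":
    inventory = [
        (2 * TB, datetime(2022, 1, 1)),   # untouched for about two years
        (1 * TB, datetime(2023, 12, 1)),  # accessed a month ago
    ]
    cold_tb, savings = cold_data_savings(
        inventory, cold_after_days=365,
        primary_cost_per_tb=25.0, archive_cost_per_tb=5.0,
        now=datetime(2024, 1, 1),
    )
    print(f"{cold_tb:.1f} TB cold, {savings:.2f} saved per month")
```

Plugging in your own per-TB costs instead of list prices is what turns a number like this from a talking point into something you can act on.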
Storage and system admins have a challenging role, and anything that can make that job easier should be applauded. While vendors do a good job of providing tools that add value, that value is mostly realized only when you stay within their ecosystem. Real enterprises are rarely homogeneous places standardized on a single platform. Real enterprises need tools that can build bridges between multiple systems and provide visibility into what’s happening. A bonus is the ability to leverage data about the environment while providing recommendations. Third-party tools exist to fill holes in capabilities or to go places that vendor-proprietary solutions can’t. Komprise does pretty much everything you’d expect of a top-notch third-party tool.