Simplified Data Archiving with Komprise

Komprise is a data management company that can address many of an organization’s issues related to rising storage costs and backups. The Komprise Intelligent Data Management platform makes recommendations around which data to archive and can simplify the archival process.

Archiving data to a cold location can make a significant difference in cost savings. Public cloud vendors such as AWS have made the idea more mainstream. Unlike AWS, however, Komprise’s archiving is transparent to users and applications. Files can be accessed without being copied back to the original NAS; they stay in the destination location in their original format. This is a powerful benefit because IT does not need to retrieve files, train end users to find files elsewhere, or alter application configurations to point to the archive location.

Challenges of Backup

Backup can be challenging for any number of reasons. The amount of file data in an organization is constantly growing, along with the types of files created. Knowledge workers still produce spreadsheets and word processing documents. Organizations are also storing more video as companies increasingly use it for both internal and external communications. While video file formats have improved over time, giving us higher resolutions in smaller byte counts, the main trouble admins have with backup is that there’s always more to protect, not less.

A sound backup strategy usually follows the 3-2-1 methodology. This means keeping three copies of the data on two different media, with one copy offsite. Years ago, some organizations could keep a full backup on a small number of tapes. Tape technology improved as successive Linear Tape-Open (LTO) generations raised write speeds and compression ratios. Even so, backup media evolved from tape to disk as disk prices came down, data volumes increased, and backup windows were increasingly missed. Moving to disk alone couldn’t solve the problem.

Solving Backup Challenges

It could be argued that migrating backup to disk just exchanged one set of problems for another. Organizational data keeps growing at an ever-increasing rate, so a 3-2-1 methodology grows increasingly expensive. Let’s take a simple example of 1 petabyte (PB). 1 PB of actual data means 3 PB of total storage: 1 PB for the original, plus another 2 PB for the copies. At the rate data is being created, the cost and effort of storing everything three times can escalate to an uncomfortable level.

Archiving (or tiering) data addresses the backup problem by helping organizations identify and focus on the data that needs to stay warm, or readily accessible. In the 1 PB example, we might find that only 25% changes frequently enough to be considered “warm” data. That means 75% of the data in our 1 PB example is being backed up over and over again unnecessarily, at the expense of the enterprise.

Komprise has a transparent archive feature where the file is moved to a new location and a link is left in the original file’s location. In our 1 PB example, the result is that only the 25% of warm data, plus a bit more for the file pointers, needs to be backed up nightly. Backing up about 250 TB is a lot easier than backing up 1 PB. Moreover, moving 250 TB offsite is much easier and possibly more affordable if you’re paying egress costs. Komprise doesn’t lock users in. The files can always be accessed directly from the new location.
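
The arithmetic behind the 1 PB example can be sketched in a few lines of Python. This is only an illustration of the numbers in the text, assuming decimal units (1 PB = 1,000 TB). The function names and the `stub_overhead` parameter are hypothetical; the text says only that the file pointers add "a bit more", so the overhead defaults to zero here.

```python
# Illustrative arithmetic only; function names and stub_overhead are
# hypothetical and not part of any Komprise tooling.

def three_two_one_tb(data_tb, copies=3):
    """Total storage when 3-2-1 keeps `copies` copies (original included)."""
    return data_tb * copies

def backup_set_tb(total_tb, warm_fraction, stub_overhead=0.0):
    """Data left to back up once cold files are replaced by small links.

    stub_overhead is an assumed fractional allowance for the links.
    """
    return total_tb * warm_fraction * (1 + stub_overhead)

total_tb = 1000.0                                # 1 PB, decimal units
before_tb = three_two_one_tb(total_tb)           # everything kept three times
warm_tb = backup_set_tb(total_tb, 0.25)          # only the warm 25%
after_tb = three_two_one_tb(warm_tb)             # warm data kept three times

print(f"3-2-1 footprint before archiving: {before_tb:.0f} TB")
print(f"Nightly backup set after archiving: {warm_tb:.0f} TB")
print(f"3-2-1 footprint of warm data: {after_tb:.0f} TB")
```

The "after" figure covers only the warm data. The archived 75% still occupies space in the archive location, so the real saving depends on what that tier costs.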

Migrating data isn’t a brand-new idea. AWS offers this capability natively, for one. However, there are still advantages in using Komprise. AWS offers multiple storage tiers, from S3 to S3 IA, to Glacier, and to Glacier Deep Archive. Each storage tier trades price against retrieval time. AWS offers a native lifecycle policy feature to move data between tiers, but there are limitations.
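
For readers who have not seen one, a lifecycle policy is a set of age-based transition rules. The sketch below builds a dictionary in the shape S3 expects for a lifecycle configuration. The helper name, rule ID, and day thresholds are assumptions chosen for illustration, not recommendations. The storage class names (`STANDARD_IA`, `GLACIER`, `DEEP_ARCHIVE`) are the S3 identifiers for the tiers named above.

```python
# A minimal sketch: build_lifecycle_config, the rule ID, and the
# 30/90/180-day thresholds are hypothetical values for illustration.

def build_lifecycle_config(rule_id, prefix, transitions):
    """Return a dict shaped like an S3 lifecycle configuration.

    transitions is a list of (days_since_creation, storage_class) pairs.
    """
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},  # empty prefix matches all objects
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days, "StorageClass": storage_class}
                    for days, storage_class in transitions
                ],
            }
        ]
    }

config = build_lifecycle_config(
    "tier-down-aging-objects",
    "",
    [(30, "STANDARD_IA"), (90, "GLACIER"), (180, "DEEP_ARCHIVE")],
)
print(config)
```

With boto3, a dictionary like this would be passed as the `LifecycleConfiguration` argument to `put_bucket_lifecycle_configuration`. The call is left out here so the example runs without AWS credentials.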

AWS limits each lifecycle policy to a single type of storage. Files on S3 are archived to other tiers of S3. Files stored on NAS (EFS) are archived to other EFS tiers. Komprise can archive files across storage types, from EFS to S3 among others, for a more full-featured file management platform. Retrieving data from the deep archive tier can take many hours. Anyone planning to put data in that tier should be certain that the data archived is truly cold. AWS policies are based on the time the file was last modified.

Last modified time isn’t the best measure of whether a file should be archived. It’s a blunt instrument: a file that hasn’t changed in years may still be read regularly. Archiving on that basis leaves data moving back and forth between tiers, and the customer pays for each retrieval.
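
A small sketch shows why modification time alone can mislead. It assumes last access time as the alternative signal, which the text above does not name, and it uses a made-up file record and a one-year threshold purely for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical example: the file record, the dates, and the 365-day
# threshold are invented to illustrate the point, not taken from any product.

def is_cold(last_event, now, threshold_days):
    """Treat a file as cold if its last event is at least threshold_days old."""
    return (now - last_event) >= timedelta(days=threshold_days)

now = datetime(2024, 1, 1)
handbook = {
    "modified": datetime(2021, 6, 1),    # not edited in years
    "accessed": datetime(2023, 12, 30),  # but read two days ago
}

cold_by_modified = is_cold(handbook["modified"], now, 365)
cold_by_accessed = is_cold(handbook["accessed"], now, 365)

print("Cold by last modified:", cold_by_modified)
print("Cold by last accessed:", cold_by_accessed)
```

A policy keyed only on modification time would archive this file even though it is still in active use, which is exactly the case that triggers repeated retrievals.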

Komprise Makes it Simple

Komprise makes it possible to reduce the backup targets to only the active data in your environment. This leads to several benefits. The first is smaller backup windows, because there is less data to back up. Another is lower backup cost, because less backup target storage is needed. But perhaps more interestingly, licensing costs could also be reduced.

Depending on how your current backup solution is licensed, fewer licenses (or perhaps licenses of different types) could be needed. Different vendors have different licensing models, with some choosing to license by capacity. In such an instance, Komprise would, at the very least, slow the need to increase backup capacity licenses; in the best case, expenses would go down. Moreover, Komprise eliminates the need to rehydrate data when the files or objects are accessed. Cost savings can occur because the files can live in the cloud, where storage is less expensive, yet still be accessed in their native format without being copied back to the source NAS.

But there’s one more saving that could trump all of it: admin time and effort. Simplifying backup by reducing the amount of backup infrastructure typically translates to a longer mean time between failures, which means fewer problems to fix. Fixing fewer problems means admins can address other areas of organizational technical debt. And that really is where a product shows its value.


About the author

Nathaniel Avery

Systems Engineer with 20 years of experience designing, planning, and implementing complex systems integrating custom-built and COTS applications.
