INFINIDAT InfiniSync – A World of Infinite Possibilities in Zero RPO Synchronous Replication

Last week INFINIDAT announced some major milestones to expand their product portfolio. I’ve already covered the technical foundations of INFINIDAT’s enterprise innovation in my first post. Today, let’s take a closer look at one of last week’s announcements: InfiniSync.

InfiniSync is a synchronous, zero RPO replication solution that works over infinite distance. I already hear some raise their hands and shout “hold on!” Fear not! We will cover the “how” soon enough.

Disaster Recovery / Business Continuity Imperatives

Implementing a business continuity/disaster recovery strategy is a precise discipline backed by a clear business rationale: to keep the business activities (and the organization’s capabilities) intact, or as close to intact as possible, in the event of a major catastrophe.

When critical assets/processes have been assessed and it’s determined that they are eligible to be protected by disaster recovery, a new series of challenges and questions arise for the organization:

  • Where should we locate our secondary site?
  • How should we protect our mission-critical systems?
  • Will our current performance requirements be impacted by the DR solution?

Usually, the Recovery Point Objective (RPO, i.e. how recent the data we can recover is) and the Recovery Time Objective (RTO, i.e. how long it takes to bring our systems up again) will dictate the answers to most of the questions above. Of course, when asked, every customer will immediately reply that they want an RPO and an RTO of zero. While this sounds funny to many consultants, let’s also bear in mind (at the risk of repeating ourselves) that for many organizations, the systems in BC/DR scope are mission-critical systems that are the lifeline of the business’ ability to thrive.
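To make the two metrics concrete, here is a minimal sketch that derives the RPO and RTO actually achieved in an incident from three timestamps. The function names and the sample timestamps are hypothetical and chosen purely for illustration.

```python
from datetime import datetime, timedelta


def achieved_rpo(last_replicated_write: datetime, disaster: datetime) -> timedelta:
    """Data-loss window: time between the last write copied off-site and the disaster."""
    return max(disaster - last_replicated_write, timedelta(0))


def achieved_rto(disaster: datetime, service_restored: datetime) -> timedelta:
    """Downtime: time between the disaster and the moment systems are back up."""
    return max(service_restored - disaster, timedelta(0))


# Hypothetical incident: last replicated write at 11:50, disaster at 12:00,
# service restored at 14:30.
disaster = datetime(2024, 1, 1, 12, 0, 0)
rpo = achieved_rpo(datetime(2024, 1, 1, 11, 50, 0), disaster)
rto = achieved_rto(disaster, datetime(2024, 1, 1, 14, 30, 0))
print(f"RPO: {rpo}, RTO: {rto}")  # RPO: 0:10:00, RTO: 2:30:00
```

A zero RPO means the first function returns zero for every possible disaster time, which is why it can only be met if each write is protected before it is acknowledged.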

Data Center Location – Not Always A Happy Choice

While the business and IT provide requirements about performance, RTO and RPO, the location of the secondary site creeps in as another constraint in projects. This often goes beyond pure IT/business requirements and encompasses location, link speeds, availability of adequate facilities, commute times for support teams, etc.

Because of these considerations, it is not infrequent to see data center schemes where primary and secondary sites, while distinct, are still relatively close, often within the same metropolitan area. If we consider the potential “blast radius” that a major disaster can create, this relative proximity can cause both the primary and secondary sites to be affected by the same event. Needless to say, this would negate any of the planned precautions & benefits of implementing a BC/DR strategy.

Organizations that can afford to spread out their primary and secondary (and sometimes even tertiary) data centers across larger distances are logically more likely to offset the risks posed by a major disaster than those that settle for sites at closer distances. However, this spread poses a challenge: how do you ensure the data generated on a primary site is transferred to the other sites fast enough to avoid any data loss in the context of Tier1 mission-critical systems?

We find on the market various kinds of solutions that are unfit for this challenge:

  • Synchronous, zero RPO solutions that work within limited distances (usually within 100 km / 62 miles), with a potential performance impact that increases with distance. This is because every write must be acknowledged by the remote site before it completes, so the network round-trip time is added to each write.
  • Asynchronous, “near-zero” RPO solutions that are usually more realistically within the 5 to 15 minutes RPO timeframe, allowing for greater distances at the cost of potential data loss. However, they do not meet the very stringent requirements of mission-critical applications.
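The distance limit on synchronous replication follows from simple arithmetic. The sketch below estimates the minimum round-trip time a remote acknowledgement adds to each write, assuming light travels through optical fiber at roughly 200,000 km/s (about two thirds of its speed in a vacuum). It ignores switching, protocol overhead and non-straight cable routes, so real figures will be higher. The constant and function names are illustrative.

```python
# Assumption: signal speed in fiber is about 200,000 km/s, i.e. 200 km per millisecond.
FIBER_KM_PER_MS = 200.0


def round_trip_ms(distance_km: float) -> float:
    """Lower bound on the round-trip time over a fiber link of the given length."""
    return 2 * distance_km / FIBER_KM_PER_MS


for km in (10, 100, 1000, 5000):
    print(f"{km:>5} km -> at least {round_trip_ms(km):.1f} ms added per acknowledged write")
```

Under these assumptions, 100 km costs at least 1 ms per write, while 5000 km costs at least 50 ms, which is far more than a latency-sensitive Tier1 application can usually tolerate.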

Traditionally, the challenge has been addressed by creating a hardened “bunker” site to which Tier1 workloads are replicated synchronously at acceptable speeds, then replicating them asynchronously over longer distances, leading to expensive and ill-used three-site DR topologies. There has to be a way to implement an efficient BC/DR strategy that addresses most if not all of the problems above, but how?

InfiniSync

INFINIDAT considered this problem and came up with a solution called InfiniSync, which aims to solve the problems mentioned above by:

  • Implementing synchronous, zero RPO replication at infinite distance, avoiding the blast radius of regional disasters.
  • Reducing the performance impact to negligible levels.
  • Allowing easy and fast recovery of applications/systems in case of disasters.
  • Eliminating the need for three-site replication topologies.

InfiniSync is a hardened hardware appliance (and when we say hardened, we mean it; we’ll come back to this soon) located alongside an existing InfiniBox array. The appliance in turn communicates with a distant InfiniBox array at a secondary DR site.

InfiniSync leverages INFINIDAT’s snapshot technology, InfiniSnap, to create instant, zero performance impact snapshots that are replicated in less than 200 microseconds between the InfiniBox array and the InfiniSync appliance. Data acknowledgements are immediate and done without performance penalty to the main InfiniBox array. INFINIDAT’s lightweight snapshot structure allows the almost instantaneous replication of the data to the distant InfiniBox, either over a wired connection or, if all else fails, broadband LTE. IT operations teams can also use InfiniSync’s protected Wi-Fi network to extract data locally during primary site recovery operations.
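The following toy model illustrates why the local appliance matters. It is not INFINIDAT’s implementation: it assumes each write is acknowledged once it reaches the appliance (using the 200 microsecond figure quoted above as an upper bound), that the link has unlimited bandwidth, and that only a fixed one-way delay separates the appliance from the remote array. All names and values are hypothetical.

```python
# Assumption: array-to-appliance copy takes 0.2 ms (the "less than 200 microseconds" figure).
LOCAL_ACK_MS = 0.2


def pending_in_appliance(write_times_ms, one_way_delay_ms, disaster_ms):
    """Writes already acknowledged to the application but not yet at the remote array."""
    pending = []
    for t in write_times_ms:
        acked_at = t + LOCAL_ACK_MS              # write is safe in the appliance
        at_remote = acked_at + one_way_delay_ms  # write lands on the remote array
        if acked_at <= disaster_ms < at_remote:
            pending.append(t)
    return pending


# Hypothetical workload: one write every 5 ms, 12 ms one-way delay, disaster at t = 21 ms.
writes = [0.0, 5.0, 10.0, 15.0, 20.0]
print(pending_in_appliance(writes, one_way_delay_ms=12.0, disaster_ms=21.0))
# [10.0, 15.0, 20.0]
```

In this model, the writes in the returned list exist only on the primary array and the appliance when disaster strikes, so zero RPO holds only if the appliance survives and can deliver them afterwards.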

Built to withstand the worst

InfiniSync appliances are hardened in the true physical meaning of the word. First of all, they are built to sustain prolonged power failures (36 hours) thanks to onboard batteries. If all connectivity fails, they are equipped with broadband LTE connectivity over two redundant & protected LTE antennas.

These would be great attributes if we were only talking about a power loss or connectivity outage. But battle-hardened data center veterans have seen their fair share of disasters and know that water, fire, earthquakes and many other dangers often strike just at the time when they will hurt the most. I vividly remember how a freshly built local data center was specifically placed to avoid any water pipes, only to find one day that after the installation the building plans didn’t account for a water pipe (which wasn’t even known by the landlord) that ran exactly above several server racks. Of course there was eventually a water leakage right onto the rack which had most of the business-critical hardware installed. Ouch.

The InfiniSync appliance is purpose-built to endure adverse conditions. According to INFINIDAT, the appliance can withstand direct flames up to 2000°F / 1093°C (one hour up to 1733°F / 945°C), prolonged periods of intense heat (up to 5 hours at 500°F / 260°C), various environmental shocks related to structural collapse or earthquakes (extreme shocks, drops, piercing, crushing weight), and water submersion.

The InfiniSync solution has been built to ensure that not even the toughest disasters impact data integrity and data replication. Provided mission-critical data has been synchronously replicated between the primary InfiniBox and the InfiniSync appliance, regardless of the data amounts, it’s up to the job.

Conclusion

INFINIDAT’s InfiniSync solution offers efficient, full-spectrum coverage of mission-critical Tier1 workloads against multiple disaster scenarios: it provides coverage against widespread regional disasters as well as rolling disaster scenarios without the cost and complexity of solutions proposed by their competitors.

InfiniSync also provides zero-RPO synchronous replication over any distance, without performance impact, and allows the immediate recovery of applications and systems at secondary locations. These impressive features are made possible thanks to InfiniSnap, INFINIDAT’s core snapshot technology foundation.

Finally, InfiniSync reduces the infrastructure footprint required to operate traditional zero-RPO replication solutions over larger distances by eliminating the need for specific investments (expensive leased lines, bunker sites).

The addition of InfiniSync to INFINIDAT’s portfolio opens the enterprise IT world to previously unknown mission-critical synchronous data replication capabilities over very long distances. It is another testament to INFINIDAT’s commitment to innovation. It further consolidates INFINIDAT’s position in the enterprise-class primary storage market and will certainly boost its value proposition with the most demanding enterprise customers.

About the author

Max Mortillaro

Massimiliano "Max" Mortillaro is a Partner & Principal Analyst at TECHunplugged.io. He's a former 5-star VMware vExpert, one of the re-founders of the Czech Republic VMware User Group and its former leader. He's an advocate for online security, privacy, encryption and digital rights. Despite what his name might suggest, Max is French and lives with his family in Prague, Czech Republic. Besides being a failed sportsman, he is a general bon vivant and the embodiment of your average hobbit in full size.

2 Comments

  • To be clear, this is synchronous replication from primary storage to a local sync appliance. That appliance then sends the data to a remote appliance, which then writes to the remote storage. That middle relationship can have many outstanding writes. Synchronous replication usually implies that the production app doesn’t move on until the data is safely in both sites, thus ensuring a true zero RPO. This solution doesn’t do that, as the app can continue while there are still possibly outstanding writes left in the sync appliance. Unless I’m missing something?

    • Correct,

      They are saying if a disaster occurs, you will be able to grab that box out or if you can’t get to it, sync it over LTE or Wifi to a laptop with a big disk. Not quite Synchronous in the true meaning of the word, but if you had the 2nd site in a 10Gb/s loop under 10ms, it would be pretty close behind, and the small delta would probably be able to be recovered or replicated after the disaster.