Ocarina Networks presented to us during the Day 2 morning session of the GestaltIT Tech Field Day. Their presentation was a deep dive into storage compression and optimization. If you read my Ideas About Presenting To Engineers from earlier this week, then you’ll know what I mean when I say that Ocarina had “black magic” that wasn’t as interesting to me as how and where the solution was deployed in the data center. After all, Ocarina claims that they will optimize and compress data on any storage device. I wanted to know how they could integrate with existing, third-party storage before I was ready to absorb how their compression and de-duplication were actually achieved.
I don’t want to mislead anyone – almost all of my fellow delegates were deeply impressed with the block-by-block algorithm lesson on Ocarina’s technology. The storage gurus were so into it that I had to wait for the hands-on labs to start before I could get my questions answered. In fact, I never did the labs because I spent the whole time at the whiteboard understanding the deployment options.
My post is therefore about the implementation options for an Ocarina solution. I’m using several quotes from posts by Carter George, co-founder of Ocarina Networks, taken from the Online Storage Optimization Blog to help me explain the deployment options. Then I’ll cover some ideas about how Ocarina could function in a virtual environment.
For those who want to know more about the deep dive on the Ocarina technology, follow the links at the end of this write-up to my fellow attendees’ posts.
Disclosure – Ocarina gave me a ceramic ocarina (flute) and Nirvanix has offered me a temporary evaluation of their service.
Ocarina Needs an Ethernet Connection
First of all, Ocarina’s technology works with NFS and CIFS storage. That means IP connectivity must exist between the data shares and the Ocarina server / appliance for two of the three possible deployment options. Basically, Ocarina remotely monitors the storage device, via its normal management interface, for activity it can optimize. I was told that there was no need for a dedicated or private network.
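To make the out-of-band monitoring concrete, here is a minimal Python sketch, assuming the NFS or CIFS export is simply mounted over the existing IP network and scanned for candidate files. The mount point, file types, and idle threshold are my own illustrative choices, not anything Ocarina specifies:

```python
import os
import time

# Hypothetical mount point for an NFS or CIFS export reachable over
# ordinary IP connectivity -- no dedicated or private network required.
SHARE_MOUNT = "/mnt/nas_export"
CANDIDATE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".flv", ".pdf"}
MIN_AGE_DAYS = 30  # only consider files that have been idle for a while

def find_optimization_candidates(root=SHARE_MOUNT):
    """Walk the mounted share and yield files an optimizer might target."""
    cutoff = time.time() - MIN_AGE_DAYS * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            ext = os.path.splitext(name)[1].lower()
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is inaccessible; skip it
            if ext in CANDIDATE_EXTENSIONS and info.st_mtime < cutoff:
                yield path, info.st_size

if __name__ == "__main__":
    total = sum(size for _path, size in find_optimization_candidates())
    print(f"Candidate data on share: {total / 2**30:.1f} GiB")
```

The point is simply that an ordinary mount over the existing network is enough for the scanning side of the product.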
3 Deployment Methods
As the post Going Native CIFS explains, Ocarina can be deployed in one of three ways:
“Ocarina Inside”: Ocarina is embedded in a NAS vendor’s solution.
Examples of “Ocarina Inside” are EMC Celerra, HP Enterprise NAS, BlueArc, and HDS HNAS. Additional “Ocarina Inside” partners will be announced soon. This is the best form of integration, because it makes deduplication and compression completely transparent to users and applications, and lets each storage vendor deliver all their full value-add, including in the CIFS protocol stack.
Ocarina Appliance: A split-band appliance
In the Ocarina Appliance case, Ocarina’s optimization happens out of the customer data path, but in order to expand files to their original state upon user access, the Ocarina appliance intercepts read requests in-band. If an I/O (over CIFS or NFS) is to an Ocarina-optimized file, we step in, rehydrate the file, and pass it on to the user. This involves being a proxy for NFS and CIFS (and other protocols, including WebDAV and HTTP).
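To picture what that split-band read path looks like, here is a rough Python sketch of the decision an in-band proxy has to make on every read. The `.ocz` sidecar naming and the gzip stand-in for rehydration are my own assumptions for illustration; they are not Ocarina’s actual on-disk format or decoder:

```python
import gzip
import shutil

# Assumed convention: an optimized file is stored alongside the original
# path with this suffix. Purely illustrative, not Ocarina's real layout.
OPTIMIZED_SUFFIX = ".ocz"

def serve_read(requested_path, out_stream):
    """Illustrative split-band read path: optimization happens out of band,
    but reads of optimized files are intercepted and rehydrated in-band."""
    optimized_path = requested_path + OPTIMIZED_SUFFIX
    try:
        # If an optimized copy exists, expand it transparently for the client.
        with gzip.open(optimized_path, "rb") as src:  # gzip is a stand-in decoder
            shutil.copyfileobj(src, out_stream)
    except FileNotFoundError:
        # Not optimized: pass the original file straight through.
        with open(requested_path, "rb") as src:
            shutil.copyfileobj(src, out_stream)
```

Writes and non-optimized reads flow through untouched; only reads that hit optimized files cost the appliance any work.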
Ocarina Native Format Optimization (NFO): files are optimized in their native format
This is a special use of Ocarina where we take certain rich media file types — photos, images and video — and compress them in a special way. What we do is compress them, but have the output be a new, smaller file but in the same native format it started out in. We’ll take a JPEG photo, compress it, and produce as output another perfectly formed JPEG photo….just smaller. The same is true for example for Flash videos. Now in this case, there is no need for a decompressor or for Ocarina to be in the read path or on the protocol at all. We can read files from your NetApp, shrink them, write them back on to your NetApp and Ocarina need not be involved at all when users or applications go to access those files.”
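Ocarina’s native-format recompression is their own technology, but the idea of “same format in, smaller file of the same format out” can be illustrated with a simple JPEG re-encode using Pillow. The quality setting, the temp-file handling, and the fact that this particular re-encode is lossy are my choices for the sketch, not a description of what Ocarina actually does:

```python
import os
from PIL import Image  # pip install Pillow

def shrink_jpeg(path, quality=85):
    """Re-encode a JPEG so the output is still a perfectly valid JPEG,
    just smaller -- readers need no special decompressor afterwards."""
    tmp = path + ".tmp"
    with Image.open(path) as img:
        img.save(tmp, "JPEG", quality=quality, optimize=True)
    before, after = os.path.getsize(path), os.path.getsize(tmp)
    if after < before:
        os.replace(tmp, path)  # keep the smaller file: same name, same format
        print(f"{path}: {before} -> {after} bytes")
    else:
        os.remove(tmp)         # re-encoding didn't help; keep the original
```

The key point is on the read side: the output is still an ordinary JPEG, so nothing has to sit in the data path to expand it.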
Virtual Machines and Ocarina
I got the following information from the post titled Dedupe Misconceptions. The post specifically references VMware virtual machines, but the scenario can be easily imagined for any virtualization solution.
In a virtual machine environment, a storage array may be storing thousands of VMDKs, the VMware files that store a given virtual machine. Inside each VMDK file is a complete virtual machine image, including the operating system, application files and user data. If you have 1,000 VMDKs, each holding a virtual Windows machine, you’ll have tens of thousands of “files” inside each VMDK file, including a copy of Microsoft Windows, the application you are running in the virtual machine, and often the data for that application as well. How much of the Windows operating system do you suppose is duplicated across the 1,000 VMDKs in this example? Well, almost all of it. What’s more, the thousands of files that make up Windows do not change – are not changeable, in fact, unless you do an OS upgrade.
Large parts of each VMDK file are duplicates of the others, and they stay the same, day after day. Perfect candidates for dedupe. Sure, the user data in a VMDK may change, but any competent dedupe solution is not deduplicating whole files – it is deduplicating at sub-file granularity: blocks, objects, chunks, etc.
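A quick way to see how much of that duplication sub-file dedupe could find is to hash fixed-size blocks across a set of VMDKs and count the unique ones. This sketch uses a naive 4 KB fixed-size chunking of my own choosing; a real product would almost certainly use smarter chunking and indexing:

```python
import hashlib

BLOCK_SIZE = 4 * 1024  # fixed-size chunks; real dedupe may use variable chunking

def dedupe_estimate(vmdk_paths, block_size=BLOCK_SIZE):
    """Hash every block across a set of VMDK files and report how many
    blocks are duplicates that sub-file dedupe could collapse."""
    seen = set()
    total_blocks = 0
    for path in vmdk_paths:
        with open(path, "rb") as f:
            while True:
                block = f.read(block_size)
                if not block:
                    break
                total_blocks += 1
                seen.add(hashlib.sha256(block).digest())
    if total_blocks:
        unique = len(seen)
        print(f"{total_blocks} blocks scanned, {unique} unique "
              f"({100 * (1 - unique / total_blocks):.1f}% duplicate)")
```

Run it against a handful of Windows VMDKs and the duplicate percentage tells the story the quote above describes.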
VI Design Ideas for Ocarina
Unfortunately, Ocarina cannot optimize the primary storage of live VMs. That is not to say there is no place for the technology in a virtual environment.
When I asked Ocarina how they felt they fit best with virtualization, it was suggested that VMs that remain powered off could be moved to a secondary NFS datastore that Ocarina compresses. Before these VMs could be powered on, they would be migrated back to the primary datastores, which Ocarina does not optimize. At a high level, this seems like a great way to maintain backup clones of production VMs without provisioning twice as much storage. On a related note, we also listened to a Nirvanix Enterprise Cloud Storage presentation while at Ocarina’s office.
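As a sketch of that workflow, assuming hypothetical datastore names and a stub relocate call in place of whatever storage-migration mechanism your environment actually exposes (PowerCLI, pyVmomi, etc.):

```python
from dataclasses import dataclass

# Assumed datastore names -- purely illustrative.
OPTIMIZED_DS = "nfs-ocarina-archive"  # secondary NFS datastore Ocarina compresses
PRIMARY_DS = "nfs-primary"            # production datastore left untouched

@dataclass
class VM:
    name: str
    power_state: str  # "poweredOn" / "poweredOff"
    days_idle: int
    datastore: str

def relocate(vm: VM, datastore: str) -> None:
    """Stub for a storage migration call (svMotion or similar in real life)."""
    print(f"Relocating {vm.name}: {vm.datastore} -> {datastore}")
    vm.datastore = datastore

def archive_powered_off_vms(vms, idle_days=30) -> None:
    """Park long-idle, powered-off VMs on the Ocarina-optimized datastore."""
    for vm in vms:
        if vm.power_state == "poweredOff" and vm.days_idle >= idle_days:
            relocate(vm, OPTIMIZED_DS)

def prepare_for_power_on(vm: VM) -> None:
    """Always move a VM back to primary storage before it is allowed to boot."""
    if vm.datastore == OPTIMIZED_DS:
        relocate(vm, PRIMARY_DS)
```

The discipline that matters is the last function: nothing boots from the optimized datastore; it only holds cold clones.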
Thinking about other virtualization scenarios for Ocarina, I considered how significant storage savings could be achieved for file servers, document management, interior design drawings, photography and film repositories, or any other application that manages a lot of files. To consolidate hardware, the server operating system could be encapsulated as a virtual disk, while the application data could be separated onto a dedicated CIFS or NFS share. I’ve even posted before about leveraging the built-in capability of NetApp filers to act as a network CIFS share. I mention NetApp specifically in my post, but remember, Ocarina works with any storage device. Ocarina would be optimizing the application data in these scenarios and not touching the virtualized pieces.
For those who want to know the deep Ocarina storage details, here are several links to other Tech Field Day blog posts on Ocarina:
- www.techhead.co.uk – videos of CTO Goutham Rao
- rodos.haywood.org – more video of Goutham and a lot of detail about the Ocarina compression and de-dupe
- storagemonkees.com – a great summary of the Ocarina session with a ton of storage insight
- renegade.tweakblogs.net – technology details and then more details in the comments
- blogs.techrepublic.com.com/datacenter – good overview of the Ocarina compression and dedupe