Let’s Take a Look at Pure Storage StorReduce
If you had a chance to read my previous article about Pure Storage, you already know that Pure has recently expanded its product line-up with a set of products aimed at helping its customers bridge the data management gap between on-premises and public cloud.
In case you missed it, Pure has launched three products:
- Cloud Block Store: powered by Purity, the same OS that runs Pure’s physical FlashArrays, Cloud Block Store gives end users the same functionality as a physical array, but on AWS: data footprint optimization, remote replication, snapshots, management tools, and all the other features you usually get from a Pure FlashArray. Performance and availability can’t match a physical array, of course, since they are limited by the resources available in AWS, but you can take advantage of the aforementioned features to replicate data across availability zones and pull off other nice tricks.
- CloudSnap: a feature that was already available, which offloads snapshots from a FlashArray to less expensive repositories such as a FlashBlade or other NFS shares, and now supports S3 too. It is quite powerful because the snapshots, which are deduplicated and self-describing, can be used for long-term retention, DR and many other use cases.
- StorReduce: acquired earlier this year, and now available in beta for Pure’s customers, StorReduce is an object storage deduplication engine. It is designed to optimize space and bandwidth consumption for S3, letting you pay for less capacity and limiting egress costs.
I’ve already covered the first two in the post I mentioned, but I think StorReduce deserves an analysis of its own.
Scenario and Challenges
Cloud is cool, and cloud storage even more so. But all the flexibility, unlimited scalability, durability, availability and accessibility come at a cost, and this is particularly true for services like Amazon S3.
I’m sure you’ve read about cloud wars many times, and how cheap cloud storage is now. Price per GB has been shrinking since forever and looks very compelling, with offers like Glacier now as low as $0.004 per GB per month. The problem is that price per GB is only one component of the total cost of storing data in the cloud which, in reality, can be very high and somewhat unpredictable.
The real cost of AWS S3 is explained here. The price per GB is only the first line of the bill, then you have:
- How much data you read back
- How much data you transfer out of AWS
- Acceleration (caching at the edge, or CDN if you like), which you pay on top of transfer costs
- Cross-region replication is an extra too
- Then there are object tagging, inventory and analysis
- ILM (data transfers from tier to tier, aka lifecycle management)
At the end of the day, this can easily become a billing nightmare, especially if you didn’t plan in advance or run applications that aren’t optimized to move the least amount of data possible.
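To make the point concrete, here is a back-of-the-envelope bill covering just the first three line items above. The prices are illustrative assumptions for this sketch, not AWS’s actual current rates, and the helper function is hypothetical:

```python
# Illustrative per-unit prices (USD) -- assumed for this example, not real AWS rates.
STORAGE_PER_GB = 0.023      # standard storage, per GB-month
GET_PER_1K = 0.0004         # GET requests, per 1,000 requests
EGRESS_PER_GB = 0.09        # data transfer out to the internet, per GB

def monthly_bill(stored_gb, get_requests, egress_gb):
    """Sum three of the cost components discussed above."""
    storage = stored_gb * STORAGE_PER_GB
    requests = (get_requests / 1000) * GET_PER_1K
    egress = egress_gb * EGRESS_PER_GB
    return {"storage": storage, "requests": requests,
            "egress": egress, "total": storage + requests + egress}

# 10 TB stored, with the whole dataset read back out once during the month:
bill = monthly_bill(stored_gb=10_000, get_requests=2_000_000, egress_gb=10_000)
# Egress ($900) dwarfs storage ($230): reading data back costs far more than keeping it.
```

Even with made-up numbers, the shape of the problem is clear: the moment data leaves AWS, egress becomes the dominant line on the bill, which is exactly the component deduplication can shrink.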
StorReduce to the Rescue!
StorReduce is a simple, yet amazing, product. It sits in front of S3-compatible storage and seamlessly deduplicates all incoming data, like a sort of gateway. Furthermore, with a second instance deployed remotely, all the traffic between the two is reduced and performance improved.
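The mechanics behind such a gateway can be sketched in a few lines. StorReduce’s actual chunking and indexing are proprietary, so this toy model only illustrates the principle: split objects into chunks, address each chunk by its SHA-256 digest, and store every unique chunk exactly once, with a per-object manifest for reassembly:

```python
import hashlib

class DedupeStore:
    """Toy dedupe gateway: fixed-size chunks addressed by SHA-256,
    each unique chunk stored once. (Illustrative only, not StorReduce's design.)"""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}      # digest -> chunk bytes (stands in for the S3 bucket)
        self.manifests = {}   # object key -> ordered list of chunk digests

    def put(self, key, data):
        digests = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            d = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(d, chunk)   # store each unique chunk only once
            digests.append(d)
        self.manifests[key] = digests

    def get(self, key):
        # Reassemble the object from its manifest.
        return b"".join(self.chunks[d] for d in self.manifests[key])

    def physical_bytes(self):
        return sum(len(c) for c in self.chunks.values())

store = DedupeStore()
data = b"A" * 8192 + b"B" * 4096   # two identical 4 KiB chunks plus one unique chunk
store.put("backup-1", data)
# 12 KiB logical, 8 KiB physical: the duplicate "A" chunk is stored only once.
```

Because the gateway speaks the same S3 protocol on the front end, applications don’t notice any of this; they just see a bucket that happens to consume less capacity underneath.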
It’s pretty obvious that StorReduce, thanks to its data footprint optimization, delivers three major benefits:
- Less capacity utilized on AWS S3, meaning a lower cost for storing data or, if you prefer, a better price per GB for the same amount of data.
- Fewer data transfers back and forth. Again, this leads to savings, especially if you plan to access your data often from outside AWS.
- Better performance. Data movement across clouds is optimized, and you get data faster because deduplication transfers identical blocks only once over the internet.
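The second and third benefits come from the same trick: before sending an object, compare its chunk digests with what the remote side has already seen, and ship only the unseen chunks. A minimal sketch, with assumed chunk size and made-up backup data:

```python
import hashlib

CHUNK = 4096  # assumed fixed chunk size for this sketch

def digests(data):
    """Set of SHA-256 digests for the fixed-size chunks of `data`."""
    return {hashlib.sha256(data[i:i + CHUNK]).hexdigest()
            for i in range(0, len(data), CHUNK)}

# Day 1: four distinct 4 KiB chunks, all of which must cross the wire.
day1 = b"".join(bytes([i]) * CHUNK for i in range(4))
# Day 2: the same data plus one new 4 KiB chunk appended.
day2 = day1 + b"\xff" * CHUNK

seen = digests(day1)                 # digests the remote side already holds
new_chunks = digests(day2) - seen    # only these chunks need to travel
bytes_sent = len(new_chunks) * CHUNK
# 20 KiB of logical data on day 2, but only 4 KiB actually transferred.
```

This is the whole economic argument in miniature: repetitive workloads like backups pay egress and transfer costs only on what actually changed.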
With this solution, you still have a central repository with all the benefits of a public cloud object store (scalability, flexibility, durability and global accessibility), but without some of the strings attached: costs become lower, and more predictable, thanks to the optimization.
There are several use cases for this technology including backup, disaster recovery and all those cases where data mobility based on S3 protocols is important.
Closing the Circle
StorReduce is an interesting technology which might not be for everybody, but it’s definitely good to have on top of Pure Storage’s other cloud data products and services.
The trio of Cloud Block Store, CloudSnap and StorReduce helps Pure Storage offer its customers an expanded vision and a broader set of use cases for hybrid- and multi-cloud storage. StorReduce is in beta now and, as mentioned in my previous article, I hope this product will soon be available on Microsoft and Google clouds, even if I understand that AWS is the low-hanging fruit for everybody.