One of the topics I’ve often written and spoken about is thin provisioning. This series of 11 articles is an edited version of my thin provisioning presentation from Interop New York 2010. I hope you enjoy it!
I began by introducing the core problem: Storage isn’t getting any cheaper, and poor utilization and provisioning problems are the reason. Thin provisioning isn’t all it’s cracked up to be, but it has a place. So what’s wrong with it?
The problem with thin provisioning starts with the telephone game. Did you ever play the telephone game as a kid? Maybe you whisper “I like potatoes” to the first person in a circle and when it comes back to you it’s changed to “Meet Mike on the patio” or something like that. It’s a totally different message.
What happens in the telephone game is that a little bit of information gets lost at each step along the path, and at the end of the chain you’ve basically lost all the information. And this happens all the time in computers, especially in data storage.
We storage guys are stuck at the bottom of a stack that includes many layers. Each of those layers loses something in translation, mostly because it’s pretending to be something that it’s not.
Think about storage today: We’ve got fake file systems pretending to live on fake disks that pretend to be directly connected to your computer. But they’re not.
Everything we do in enterprise storage is basically faking out something else so compatibility is maintained. And at each step (the file system, the database, the host, the network) you’re losing information. By the time you get to the storage system, there’s just no communication whatsoever.
This is the core problem with thin provisioning. The application knows what data is temporary, and that would be very useful for the storage system to act upon. But by the time the data gets there, the message is lost. Maybe the application tells a database. Maybe the database tells the file system. Maybe the file system tells the volume manager. But that’s about as far as it’s going to go. So, this is really the issue. It’s the telephone game.
Let’s say we have a file system and some storage. We want to write some data. So, the file system says, “Hey, here’s my new block.” And the storage says, “Yeah, I got it.”
This is the classic way of doing thin provisioning. So the storage system is now only using the little blue box. The file system adds some new data, then some more data, and the storage just keeps growing. The rest of the space can be reallocated.
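To make the allocate-on-write idea concrete, here is a minimal sketch in Python. It is a toy model, not any vendor’s implementation: the `ThinVolume` class and its method names are made up for illustration. The volume advertises a large size to the host but only consumes backing space for blocks that have actually been written.

```python
class ThinVolume:
    """Toy thin-provisioned volume (hypothetical, for illustration only)."""

    def __init__(self, advertised_blocks):
        # The size the host believes it has been given.
        self.advertised_blocks = advertised_blocks
        # Block number -> data. Space is consumed only on first write.
        self.backing = {}

    def write(self, block, data):
        if not 0 <= block < self.advertised_blocks:
            raise IndexError("block outside the advertised volume")
        # A first write allocates the block; a later write just overwrites it.
        self.backing[block] = data

    def allocated_blocks(self):
        return len(self.backing)


vol = ThinVolume(advertised_blocks=1000)
vol.write(0, b"new block")
vol.write(1, b"more data")
vol.write(1, b"rewritten")  # same block again, so no extra allocation

print(vol.allocated_blocks(), "of", vol.advertised_blocks, "blocks in use")
```

Notice that nothing in this model ever makes the allocated count go down. That is fine for as long as the file system only adds data.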
We’re good so far. This is so simple that most products in storage now have something like this. Of course, it took them 10 years to do it, but they finally have it.
So, we’re good. We can allocate storage. But what about deallocation?
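If I delete something, the file system has to tell the storage, and then the storage has to shrink the allocated capacity. But we’re not doing that. Most file systems don’t actually send that information on. When you delete a file, most file systems just update their own metadata to mark the space as free, which means they actually write more data instead of deleting anything. As far as the storage is concerned, every block that was ever written is still in use.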
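Here is a rough sketch of why a delete never reaches the storage. Everything in it is hypothetical: `ToyFileSystem`, the single metadata block, and especially the `unmap` call, which stands in for whatever mechanism a real stack might use to hand blocks back. The point is only to show the difference between a file system that stays quiet and one that passes the message along.

```python
class ThinVolume:
    """Toy thin volume: consumes space only for blocks it has seen."""

    def __init__(self):
        self.backing = {}

    def write(self, block, data):
        self.backing[block] = data

    def unmap(self, block):
        # Hypothetical reclamation call: the host says it no longer needs this block.
        self.backing.pop(block, None)


class ToyFileSystem:
    METADATA_BLOCK = 0  # assumption: all metadata lives in one block

    def __init__(self, volume, notify_storage=False):
        self.volume = volume
        self.notify_storage = notify_storage
        self.files = {}      # file name -> list of block numbers
        self.next_block = 1

    def _write_metadata(self):
        listing = repr(sorted(self.files)).encode()
        self.volume.write(self.METADATA_BLOCK, listing)

    def create(self, name, chunks):
        blocks = []
        for chunk in chunks:
            self.volume.write(self.next_block, chunk)
            blocks.append(self.next_block)
            self.next_block += 1
        self.files[name] = blocks
        self._write_metadata()

    def delete(self, name):
        blocks = self.files.pop(name)
        # The delete itself is just another write, to the metadata.
        self._write_metadata()
        if self.notify_storage:
            for block in blocks:
                self.volume.unmap(block)


def allocated_after_delete(notify_storage):
    vol = ThinVolume()
    fs = ToyFileSystem(vol, notify_storage=notify_storage)
    fs.create("scratch.tmp", [b"a", b"b", b"c"])
    fs.delete("scratch.tmp")
    return len(vol.backing)


print(allocated_after_delete(notify_storage=False))  # 4: nothing was given back
print(allocated_after_delete(notify_storage=True))   # 1: only the metadata block remains
```

In the silent case the storage still holds three blocks of deleted data plus the metadata block, and it has no way to know any of it is garbage.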
Thin reclamation is the core technical challenge to thin provisioning, and the telephone game is the reason. Next we’ll present some solutions that are currently being worked out.