I don’t know who said it first, but I know I heard it from Keith Townsend more than a few times: data is the new oil*. The metaphor stems from viewing the dominance of the giant hyperscale companies like Amazon, Facebook, and Google through a lens similar to that of the Standard Oil trust. This posits that a small number of firms have found ways to exploit data to reap massive profits, with seemingly no competition in sight. But as a framing device for the value of data, it has some issues.
Data as a Renewable Resource
Obviously, data is not a limited resource like oil. While Standard Oil never had to worry about this issue at its height in the nineteenth and early twentieth centuries, the fact remains that oil is a geographically bound and finite resource in terms of acquisition. Data has no such restriction. Indeed, we would not see nearly as many firms enriching themselves from data if it were. Instead, it is often the same (or very similar) data that these companies use.
Data may not be unlimited at any one time. Pedants could argue that at a given instant, there is a limited set of data points in existence. But as a presence through time, data has a renewable quality. Any time humans, algorithms, or machines engage with anything, there is the possibility for data to be generated. Or to borrow the parlance of physics, there exists in matter potential data waiting to be extracted. I suppose if we took a long enough view, anything carbon-based has the potential to become oil given time and the right conditions, but the process of “datafication” is much more apparent.
Limiting Data as a Business Model?
And if data is the new oil, how could we then see firms moving away from data collection as a business model? Consider that Apple, perhaps the most valuable company in the world, is extremely hesitant to exploit their potential data. Their market is in devices and services, and they are consciously stemming the flow of data back to themselves as part of their business model. It is a rare oil company indeed that would market differential extraction as a product.
The value of oil is primarily one of scarcity. Some places have oil, others have no access, and this can (and does) create financial and political imbalances. This is not to say a data scarcity couldn’t exist. As more and more businesses depend on the firehose of data from Facebook and Google, a decision to cut off that flow would have massive repercussions. But no one is saying that we’re drowning in oil (alright, perhaps in an environmental context), that there is so much oil we can’t possibly handle it. But that’s exactly what people are saying about data. The clarion call of modern IT is that data is not only valuable in its own right, but threatens to drown any organization that can’t handle its exponentially rising tide.
While it may not be as popular, I think it’s more apt to say that data is the new solar power for the modern enterprise.
Much like solar, data storage and retrieval remains a major pain point for organizations. Despite the growth of both speed and capacity with all-flash arrays, we’re still awash in so much data that it is challenging to analyze it fast enough to be useful in many applications. This storage-battery issue isn’t enough to nullify the basic usefulness of either, but it does limit the application until it is substantially refined.
As part of this storage issue, efficiency of data also plagues nascent efforts. Recognizing that data has a value in and of itself has made it imperative to collect all data, whether there is an immediate business need or not. This adds to both the perception of drowning in data, and the very real problems of storing and retrieving it when it becomes applicable. This might not be as much of a problem for a hyperscaler like Google, but for more mundane organizations it is a major hurdle.
Finally, much like claims of solar power in the past, many organizations feel like they’ve been sold a bill of goods with the supposed “data economy”, only to be left wanting for immediate return. For specific industries and organizational needs, prioritizing around data makes obvious sense. But much like me putting solar panels on the roof of my house in Cleveland, others won’t see the same kind of returns.
In the end, neither metaphor is perfect, and both fail to encapsulate the complexities of data’s value to the enterprise at this particular moment. With the dominance of Facebook, Google, and Amazon, it’s easy to want to view those as the modern oil barons. And maybe if we situate those companies in the post-Standard Oil trust breakup, it might be more appropriate. But I think comparing data to solar gets to the heart of the matter: we’ve only identified it as a potential source of value. We haven’t fully realized exactly how to bring that value about in many instances. If data is the new oil, it’s oil just before the explosion of automotive and geopolitical implications. But given its seemingly endless abundance, solar feels closer to home.
*Update: After discussing with Keith Townsend, the original “Data is the new oil” metaphor initially came from The Economist article we linked to in the piece.