
Gestalt IT

Independent Experts United


Data Hoarders of the World

December 29, 2014 by Justin Warren

This is post 3 of 9 in the series “NexGen Storage ioControl Tech Talks”

  1. Evolving Storage From Pets To Cattle
  2. What We Talk About When We Talk About Storage
  3. Data Hoarders of the World
  4. Learn to Love The Data Not The Hardware
  5. Data Is A Four-Letter Word
  6. How To Classify Data
  7. End-to-End Data Management
  8. The Battle For Your Data Center’s Brain
  9. Panning For Data Gold

A series of tech talks about NexGen Storage and their ioControl flash storage datacenter solutions

“Hoarding disorder is typified by persistent difficulties discarding possessions, resulting in significant clutter that obstructs the individual’s living environment and produces considerable functional impairment.”[i]

Big data is an astoundingly powerful thing, if you believe the breathless boosterism emanating from some corners of the tech-oriented media. For the first time in human history, merely collecting a lot of information about things is suddenly going to result in solutions to what have so far proved stubbornly intractable problems: corruption, disease, world peace! If you simply collect a lot of data, magic algorithms will find solutions in there somewhere. Big computer with complex maths. Possibly some quantum. Isn’t it all terribly exciting?

And how do we collect all this data? Easy! Just keep everything! Like filling your house with old newspapers, the big data enthusiasts advocate buying more and more storage systems and just keeping all the data you generate, the more the better.

“This could come in handy some day,” they say, while you start renting storage space because your wardrobe is full of clothes you never wear, or rather, your storage is full of data you never look at. But you might. One day.

Because if you don’t keep it, it’s gone forever. Better be safe and just keep it, because one day you’ll get a big advantage over your competitors by virtue of your superior ability to buy exactly the same online data analysis tools as anyone else.

My point, subtle though it is, is that this is all nonsense. As the old joke goes, it’s not how big it is, but what you do with it that matters.

Data analysis is hardly new; it is the basic underpinning of all science. Standard deviation (or mean error, as used by Gauss) has been known about for some two hundred years.[ii] What is new is the ability for organisations to make use of more sophisticated tools than they used to. You can download Optical Character Recognition software for free, and voice recognition is built into your smartphone. Google has been translating the written word for years now. This was the stuff of science fiction when this author was a child, and now it’s commonplace.

Everything Old is New Again

The hand-held calculator replaced the slide-rule because it’s much easier and faster to use. Calculating logarithms used to be done in advance, and you’d look them up in a table. People used to write their financial accounts in physical ledger books (hence books of account), but now we use computers. Where would we be without the spreadsheet? Did people really use pen and paper like uncultured savages?

These new data analysis techniques are just another step on the journey of increasingly complex tool use our species has been on since we first discovered The Stick. But while a stick can be applied in a multitude of situations quite successfully, a laser sintering machine tends to apply to a more restricted set of problems.

A chainsaw is a much more powerful tool than a knife, but using one to carve a turkey is problematic. Keyhole surgery would also be ill-advised. The trick is knowing which tool to use in which situation. Having a lot of data doesn’t make problems easier to solve any more than having a shed full of tools makes you a master craftsman. Which data analysis tool should you use, and on which data? How do you decide?

Simply keeping everything actually makes your life more difficult, because when it comes time to use the analysis tools, what do you point them at? A data-centre full of animated GIFs and cat memes?

Signal in the Noise

Storing everything is impractical simply because there isn’t enough physical storage being manufactured to store it all, and as more data is being generated by more devices, this situation is getting worse, not better.[iii] And most of the data generated is actually noise, which is why you need these sophisticated tools (and the modern, extremely powerful CPUs to run them) to sort through it all to find meaning.

As Nate Silver, founder of FiveThirtyEight and author of The Signal and The Noise, said in May 2014, “Understanding a more limited information set trumps misunderstanding a gigantic information set.”[iv]

The risk of acting on a spurious correlation is very real. Did you know the divorce rate in Maine correlates with the per-capita consumption of margarine in the USA? One Australian retailer spent a lot of time and money to re-discover the Earth-shattering fact that people like chocolate.
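To see how easily this happens, here is a minimal sketch in Python. The two series are entirely made up for illustration (they are not the Maine or margarine figures) and share nothing except a downward drift, yet the Pearson correlation between them comes out very high. The function name `pearson` and the variables `series_a` and `series_b` are hypothetical names chosen for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Two synthetic, unrelated series (invented numbers, not real statistics).
# Each is a falling straight line plus a small, different wobble.
series_a = [5.0 - 0.10 * t + 0.05 * math.sin(3 * t) for t in range(10)]
series_b = [8.2 - 0.45 * t + 0.20 * math.cos(5 * t) for t in range(10)]

r = pearson(series_a, series_b)
print(f"correlation = {r:.2f}")
```

Any two quantities that both trend in the same direction over the same period will score well on a test like this, which is why a high correlation on its own says nothing about whether one thing has anything to do with the other.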

Not all data is equally valuable, and having lots of data is no substitute for knowing what you’re doing.


[i]     Nordsletten, AE et al. (2013). ‘Epidemiology Of Hoarding Disorder’. The British Journal of Psychiatry, p. bjp.bp.113.130195.

[ii]    Anon. ‘Earliest Known Uses Of Some Of The Words Of Mathematics (M)’, accessed December 19, 2014, from <http://jeff560.tripod.com/m.html>.

[iii]   Vernon Turner, David Reinsel, John F. Gantz & Stephen Minton. (2014). ‘The Digital Universe Of Opportunities: Rich Data And The Increasing Value Of The Internet Of Things’, accessed December 19, 2014, from <http://idcdocserv.com/1678>.

[iv]   Voss, J & CFA. ‘Nate Silver: “Buying Big Data To Solve Problems Is Oversold”.’ CFA Institute Annual Conference, accessed December 19, 2014, from <http://annual.cfainstitute.org/2014/05/16/nate-silver-buying-big-data-to-solve-problems-is-oversold/>.


About The Author

Justin Warren is an Australian MBA who writes and speaks extensively about the intersection of IT and marketing and how advances in technology are changing both. His blog can be found at http://eigenmagic.com and he can be followed on Twitter as @JPWarren.


This post is part of the NexGen Storage ioControl sponsored Tech Talk series.  For more information on this topic, please see the rest of the series HERE.  To learn more about NexGen’s Architecture, please visit http://nexgenstorage.com.


Comments

  1. cmccall says

    January 21, 2015 at 12:04 AM

    Just because you can, doesn’t mean you should. It’s really interesting to see the realization of legal departments that storing data forever is creating risk for corporations that outweigh the value of keeping it. There will be a significant shift toward understanding not only how valuable data is to an enterprise but how much risk it creates. Data Hoarders = Corporate Lawyer nightmare.
