Dark Data at the Edge | Utilizing Tech 05×13

In this episode of the Utilizing Tech podcast, Stephen Foskett, Allyson Klein, and Gina Rosenthal discuss dark data in edge computing. Dark data is data that organizations collect but leave unused or do not know they have. The distributed nature of edge environments and the use of third-party apps make dark data hard to handle, which limits insights and poses security risks. Establishing a stronger connection between IT and the business is crucial. Observability solutions and data analytics can aid in discovering and centralizing dark data. AI has potential to improve data hygiene, but human-driven cleaning is still necessary. Despite these challenges, edge computing may offer better data management because its deployments are controlled.

Navigating the Challenges of Dark Data in Edge Computing

In this episode, the concept of “dark data” in the context of edge computing is thoroughly explored, shedding light on a significant challenge faced by organizations in the ever-evolving digital landscape. Dark data refers to the vast amounts of data collected by organizations that remain unnoticed, underutilized, or entirely forgotten. The distributed nature of edge environments, coupled with the rapid proliferation of third-party applications, contributes to the complexity of handling dark data effectively. The discussion highlights the critical role of intentional data management, along with the aid of observability solutions and data analytics tools, in discovering and centralizing dark data. However, the conversation also delves into the potential limitations and intricacies of identifying and managing dark data in the dynamic edge computing ecosystem.

One of the primary challenges discussed in the podcast is the intentional optimization at the edge, where data is processed and filtered before being sent back to the central repository. While this approach serves to reduce bandwidth usage and optimize connectivity, it can also lead to potential data loss. The hosts raise thought-provoking questions about where to draw the line between dark data and data hoarding, emphasizing the need for thoughtful data policies in IT organizations. Striking a balance between retaining valuable data and avoiding unnecessary data accumulation is crucial to ensure effective data utilization and to mitigate security and privacy risks associated with unmanaged data.
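As a rough illustration of the trade-off described above, the following Python sketch filters readings at the edge while keeping a small record of what was dropped, so the discarded data does not vanish without a trace. The names `EdgeSummary` and `filter_at_edge`, and the simple threshold rule, are assumptions made for this example and were not discussed in the episode.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class EdgeSummary:
    forwarded: List[float]          # readings sent on to the central repository
    dropped_count: int              # how many readings were filtered out at the edge
    dropped_mean: Optional[float]   # coarse summary of the dropped readings, if any


def filter_at_edge(readings: List[float], threshold: float) -> EdgeSummary:
    """Forward readings at or above the threshold; summarize the rest."""
    forwarded = [r for r in readings if r >= threshold]
    dropped = [r for r in readings if r < threshold]
    return EdgeSummary(
        forwarded=forwarded,
        dropped_count=len(dropped),
        dropped_mean=mean(dropped) if dropped else None,
    )


summary = filter_at_edge([1.0, 5.0, 3.0, 9.0], threshold=4.0)
print(summary)
```

Sending only the forwarded readings saves bandwidth, while the count and mean of the dropped readings give the central team at least some visibility into what the edge filter discarded.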

The episode examines the intricacies of managing dark data in the context of edge computing, where data is generated, processed, and stored closer to the source, often across various applications and departments. The challenge of understanding the full scope of data generated in such a distributed environment requires a collaborative approach involving not only the IT department but also legal, security, and business stakeholders. It is essential for organizations to collectively make informed decisions about data retention and utilization, considering both the potential value and the potential risks.

The discussion also explores the potential role of artificial intelligence (AI) in managing dark data. While AI holds promise in discovering valuable insights from data, the hosts caution against relying solely on AI algorithms for data cleaning. Poor data quality can significantly impact the effectiveness of AI-driven analysis, making data hygiene an essential foundation for deriving meaningful insights. One suggestion is to clean the data before applying AI algorithms, so that they can extract meaningful information from it.
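A minimal sketch of what a cleaning step ahead of any AI-driven analysis might look like is shown below. The record fields (`device_id`, `timestamp`, `value`) and the function name `clean_records` are hypothetical and chosen only for illustration; real edge data would need rules agreed on by the people who understand it.

```python
from typing import Any, Dict, List


def clean_records(records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Normalize device IDs, drop incomplete records, and remove duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        # Normalize the device identifier so " Cam-01 " and "cam-01" match.
        device = (rec.get("device_id") or "").strip().lower()
        value = rec.get("value")
        # Skip records that are missing a device ID or a value.
        if not device or value is None:
            continue
        # Treat the same device and timestamp as a duplicate reading.
        key = (device, rec.get("timestamp"))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(
            {"device_id": device, "timestamp": rec.get("timestamp"), "value": float(value)}
        )
    return cleaned


raw = [
    {"device_id": " Cam-01 ", "timestamp": 1, "value": "3.5"},
    {"device_id": "cam-01", "timestamp": 1, "value": 3.5},
    {"device_id": "", "timestamp": 2, "value": 1.0},
    {"device_id": "cam-02", "timestamp": 2, "value": None},
]
print(clean_records(raw))
```

Only the cleaned records would then be passed to whatever analysis or model comes next, which keeps obviously bad inputs from skewing the results.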

Despite the complexities involved in managing dark data in the dynamic edge landscape, the conversation concludes on an optimistic note. The hosts express hope that edge computing, with its specific purpose and controlled deployments, may offer unique opportunities for better data management. Organizations deploying edge applications tend to have a better grasp of the data they collect and process, potentially leading to more informed data management practices.

Podcast Information

For your weekly dose of Utilizing Edge, subscribe to our podcast on your favorite podcast app through Anchor FM and check out more Utilizing Tech podcast episodes on the dedicated website.

About the author

Stephen Foskett

Stephen Foskett is an active participant in the world of enterprise information technology, currently focusing on enterprise storage, server virtualization, networking, and cloud computing. He organizes the popular Tech Field Day event series for Gestalt IT and runs Foskett Services. A long-time voice in the storage industry, Stephen has authored numerous articles for industry publications and is a popular presenter at industry events. He can be found online and on Twitter at @SFoskett.