
Querying Data at Source with Cribl Search

When we need to find information in our daily lives, we turn to search engines. Likewise, IT professionals turn to search functions to query data in distributed environments.

Data searchability is a complex subject, thick with poorly understood matters like query languages and code. As data has grown in volume, the quality of queries has declined while the cost of search has escalated. David Cavuto, Director of Product Management at Cribl, drew attention to this while presenting at the recent Security Field Day event.

Paving a Change

Over a career spanning 25 years in security and data analytics, Cavuto has seen the emergence of a growing consensus that there needs to be a way to mine data for security investigation that is flexible and budget-friendly.

“I’ve been back and forth between vendors and implementers because my experience in being a security professional lends itself to being able to come back and talk to the actual products and say this is probably how we could do it,” said Cavuto.

It didn’t take Cavuto long to realize that the industry needs a simple but powerful tool to get the best answers out of its data.

Historically, querying data has required datasets to be copied from their original location to a local destination, a practice that has inflated total data volume, driven up storage costs disproportionately, and increased query latency.

“We’re in the space in tech where we’ve a lot of infrastructure available publicly – public cloud, SaaS, etc. We’ve compute available that we can rent and lease. The question is why should we have to centralize data, bring it somewhere else?” he asks.

A Growing Data Problem Is Affecting Enterprises around the World

Reports published by IDC predict that by 2025, the total amount of data will exceed 175 ZB. Organizations are already beginning to feel the weight of data on their processes and IT budgets, and holding multiple copies of the same data does not help.

Cavuto pointed out that data sprawl has both piled work onto analytics teams and caused an inevitable surge in complexity. To keep up, enterprises are accumulating monitoring and analytics tools, with 54% of enterprises using more than 6 monitoring solutions at once. Now sitting on a glut of tools, they are constantly fighting a growing complexity problem that threatens productivity and revenue.

Unless organizations come up with an efficient way to harvest all of the data they accumulate daily, spending will soon outweigh the benefits of data analytics. And with companies racing to harvest every morsel of data that might benefit the business, a search tool that surfaces information quickly and efficiently is urgently needed.

Search in Place for Better Query Results

Some years back, the team at Cribl set out to design a search tool that instantly provides comprehensive answers to complex questions with minimal effort, time, and cost.

Most data analytics tools collect data, store it in specialized storage, and let users search it there. That is inherently disadvantageous. “There’s time involved in that, as well as infrastructure and cost,” notes Cavuto.

Cribl came up with a different kind of tool. Cribl Search allows users to search data at the source, with no duplication of massive datasets required. “It’s the product that I’ve always wanted to have. If we had this 20 years ago, we could have done so many different things,” says Cavuto.

Integrated as a service within the Cribl Suite, Cribl Search unlocks native data query for all locations. The challenge it addresses is leveraging data and compute where they are, without data ingestion. The product avoids centralizing or moving petabytes of data between cloud platforms, and presents a search function that can work with smaller datasets and works the same way across all platforms.

The Cribl Search Interface

Cribl bypasses a major limitation of native search functions in the cloud. “The biggest trouble with throwing data into S3 is that it’s opaque, a black box. If you want to ask your questions, you have to write code.” Using Cribl, users can make sense of data with common search terms.

Cribl supports all major cloud platforms – AWS, Azure and GCP. Users can ask questions of data living in any of these locations as though it were in their private analytics systems on premises.

What sets Cribl Search apart is its fast question-and-answer querying. Ask a question, and it gives back answers in seconds.

Cribl Search uses a federated, distributed model, meaning users can push a query to multiple systems simultaneously. A single query can fan out across an array of systems and come back as a single result set. Because the query runs on compute local to the data, query speed is maximized.

Cavuto emphasizes, “With Cribl Search, you ask one question – the search gets dispatched to each one of the providers and what comes back is a single unified federated set of results without any data movement whatsoever.”
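
The dispatch-and-merge pattern Cavuto describes can be sketched in a few lines of Python. This is only an illustration of the general idea, not Cribl's implementation or API: the provider names, the in-memory records, and the functions `search_provider` and `federated_search` are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "providers". Each holds its own records and is
# searched where it lives. None of these names come from Cribl's API.
PROVIDERS = {
    "aws_s3": [
        {"host": "web-1", "status": 500},
        {"host": "web-2", "status": 200},
    ],
    "azure_blob": [
        {"host": "api-1", "status": 500},
    ],
    "gcp_storage": [
        {"host": "db-1", "status": 200},
    ],
}


def search_provider(name, predicate):
    """Run the predicate against one provider's records, in place."""
    return [dict(record, source=name) for record in PROVIDERS[name] if predicate(record)]


def federated_search(predicate):
    """Dispatch the same query to every provider and merge the matches."""
    names = sorted(PROVIDERS)
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        per_provider = pool.map(lambda n: search_provider(n, predicate), names)
        # pool.map yields results in the same order as `names`.
        return [row for rows in per_provider for row in rows]


# One question, asked once, answered by all providers.
results = federated_search(lambda record: record["status"] == 500)
for row in results:
    print(row)
```

Only the matching rows travel back to the caller, each tagged with the provider it came from. The full datasets stay where they are.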

This is particularly useful for edge use cases that involve large deployments across thousands of endpoints. Through Cribl edge agents running on every edge endpoint, it is possible to dispatch a search to any number of devices without having to centralize the data or store it locally.

Many Providers, One Language

Cribl Search offers a single-language interface based on Microsoft’s Kusto Query Language (KQL), a declarative language that is easy to use and extend. The tool translates data coming in from disparate cloud interfaces into one common syntax for simpler querying, eliminating the need to query each provider separately.
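
To make the "one language" idea concrete, here is a minimal Python sketch, assuming two made-up providers that name the same fields differently. Records are first mapped onto a common schema, and then a single pipeline of stages (loosely modeled on Kusto's `where` and `project` operators) runs over all of them. The field maps, record shapes, and function names are hypothetical and do not reflect how Cribl Search works internally.

```python
# Hypothetical raw records as two different providers might shape them.
RAW = {
    "provider_a": [{"hostname": "web-1", "http_status": "500"}],
    "provider_b": [
        {"Host": "api-1", "Status": 200},
        {"Host": "api-2", "Status": 503},
    ],
}

# Per-provider field mappings onto one common schema (illustrative only).
FIELD_MAPS = {
    "provider_a": {"hostname": "host", "http_status": "status"},
    "provider_b": {"Host": "host", "Status": "status"},
}


def normalize(provider, record):
    """Rename provider-specific fields and coerce status to an integer."""
    mapping = FIELD_MAPS[provider]
    row = {mapping[key]: value for key, value in record.items()}
    row["status"] = int(row["status"])
    return row


def where(rows, predicate):
    """Keep rows that satisfy the predicate, like a `where` stage."""
    return [row for row in rows if predicate(row)]


def project(rows, fields):
    """Keep only the named fields, like a `project` stage."""
    return [{field: row[field] for field in fields} for row in rows]


# Normalize everything once, then query it with a single pipeline.
rows = [normalize(p, r) for p in sorted(RAW) for r in RAW[p]]
errors = project(where(rows, lambda row: row["status"] >= 500), ["host"])
print(errors)
```

The person writing the query only ever sees `host` and `status`. The per-provider differences are absorbed by the mapping step.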

If you’re wondering what it costs: with Cribl Search, users pay only for the search itself, that is, the cost of the CPU processing the query, and no more. Egress fees can therefore be avoided for the most part.
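
The arithmetic behind that claim can be sketched as follows. Every rate and volume below is a placeholder chosen for illustration, not Cribl or cloud-provider pricing. The two functions are hypothetical and simply compare copying a whole dataset out against paying for query compute plus egress on the result set alone.

```python
def copy_then_query_cost(total_gb, egress_per_gb, storage_per_gb_month, months):
    """Cost of moving the whole dataset out and keeping a second copy."""
    return total_gb * egress_per_gb + total_gb * storage_per_gb_month * months


def search_in_place_cost(cpu_seconds, price_per_cpu_second, result_gb, egress_per_gb):
    """Cost of query compute plus egress on the result set only."""
    return cpu_seconds * price_per_cpu_second + result_gb * egress_per_gb


# Placeholder inputs: a 1,000 GB dataset, a query that returns 1 GB.
copy_cost = copy_then_query_cost(
    total_gb=1000, egress_per_gb=0.1, storage_per_gb_month=0.02, months=1
)
in_place_cost = search_in_place_cost(
    cpu_seconds=600, price_per_cpu_second=0.001, result_gb=1, egress_per_gb=0.1
)
print(f"copy then query: {copy_cost:.2f}")
print(f"search in place: {in_place_cost:.2f}")
```

With these made-up inputs, the copy-based approach pays egress and storage on the full dataset, while the in-place approach pays only for compute and a small result set. Real figures depend entirely on actual rates and query patterns.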

“We see this as the future of data analytics – massively parallel, massively federated,” says Cavuto.

Wrapping Up

The future of data is intertwined with data searchability. Complex, subjective queries will remain integral to extracting value from data. The biggest incentive for using Cribl Search, especially for security investigations, is that it lets organizations get value out of their data without paying top dollar. It presents a sustainable way to search a subset of data, bypassing the costly and wasteful method of making bulk copies and consuming more storage than necessary for every query. Whether data sits in public or private infrastructure, this may become the default choice for users.

For more information, watch the full presentation of Cribl Search from the recent Security Field Day event.

About the author

Sulagna Saha

Sulagna Saha is a writer at Gestalt IT where she covers all the latest in enterprise IT. She has written widely on miscellaneous topics. On gestaltit.com she writes about the hottest technologies in Cloud, AI, Security and sundry.

A writer by day and reader by night, Sulagna can be found busy with a book or browsing through a bookstore in her free time. She also likes cooking fancy things on leisurely weekends. Traveling and movies are other things high on her list of passions. Sulagna works out of the Gestalt IT office in Hudson, Ohio.
