Data is data. We collect it. We store it. We run analytics against it and use it to provide information to decision makers. But no matter what we do with the data, it stays inviolate. If the temperature of your data center on January 10th at 6pm was 70 degrees, that discrete data point doesn’t change. But if we collect a lot of data points and see the temperature in the data center start rising after 8pm on the same day, that analysis of the data points gives a person information they can act on.
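To make the difference between a data point and information concrete, here is a minimal sketch that fits a least-squares slope to a handful of temperature readings. The readings, the year, and the `slope_per_hour` helper are all made up for illustration; a real monitoring tool will do something far more sophisticated.

```python
from datetime import datetime, timedelta

def slope_per_hour(readings):
    """Least-squares slope of temperature vs. time, in degrees per hour.

    `readings` is a list of (timestamp, temperature) pairs with at least
    two distinct timestamps.
    """
    t0 = readings[0][0]
    xs = [(t - t0).total_seconds() / 3600 for t, _ in readings]
    ys = [temp for _, temp in readings]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical readings every 30 minutes (the year is an assumption).
start = datetime(2019, 1, 10, 18, 0)
# 6pm to 8pm: a steady 70 degrees.
before_8pm = [(start + timedelta(minutes=30 * i), 70.0) for i in range(5)]
# 8pm to 10pm: climbing 1.5 degrees every half hour.
after_8pm = [(start + timedelta(hours=2, minutes=30 * i), 70.0 + 1.5 * i)
             for i in range(5)]

print(slope_per_hour(before_8pm))  # flat: about 0 degrees per hour
print(slope_per_hour(after_8pm))   # rising: about 3 degrees per hour
```

None of the individual readings changed. The slope is the part a person can act on.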
In order to have good information, you have to have good data. But, as above, data is data, right? Not always. Some data is better than other data. And some other data is just waiting to be made good for our tools. Maybe those temperature sensors need to be recalibrated. Maybe the connection to those sensors is slow. There are a ton of reasons why data doesn’t arrive in the format you expect to see.
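As a rough illustration of what making data good can mean, the sketch below applies a calibration offset and drops readings that are missing or physically implausible. The function name, the offset, and the valid range are assumptions chosen for the example, not anything taken from a real sensor or product.

```python
def clean_readings(raw, offset=0.0, low=40.0, high=120.0):
    """Return calibrated readings, skipping missing or out-of-range values.

    `offset` is a hypothetical calibration correction in degrees;
    `low` and `high` bound what we assume a data center could plausibly read.
    """
    cleaned = []
    for value in raw:
        if value is None:
            # The sensor did not report, perhaps because of a slow connection.
            continue
        corrected = value + offset
        if low <= corrected <= high:
            cleaned.append(corrected)
    return cleaned

# Hypothetical raw feed: one dropped sample and one sentinel error value.
raw = [70.5, None, 71.0, -999.0, 72.5]
print(clean_readings(raw, offset=-0.5))  # [70.0, 70.5, 72.0]
```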
Some companies try to fight this problem by adjusting the data after it is received by the applications using it. This is usually a job performed by data scientists. They’re very talented people who can do a lot to help organizations have good data. But they’re also expensive assets that are better used in other ways than just scrubbing input data for things like monitoring systems. So how are we supposed to get good, clean data without having an army of data scientists at the ready to fix our problems?
A Day At The Lake With ScienceLogic
One company trying to fight back against the data problem is ScienceLogic. During Cisco Live US 2019 I had a chance to sit down with Peter Luff, Senior Director of Product Marketing for ScienceLogic. Since May of last year they’ve been shipping their SL1 AIOps platform, a product designed to replicate the work that data scientists do to ensure that the data being used for analysis is clean.
One way that ScienceLogic does this with AIOps is to build a data lake in the middle of their solution. They feed in logs and time-series data from a variety of sources to start building context around that data. Those sources aren’t just networking devices, either. They collect data from storage arrays, virtual machines, and many more locations. Each source of good, clean data is crucial for building a system that can analyze the environment as a whole and give you a heads up about what’s going on when things aren’t going as planned.
Because AIOps is applying machine learning principles to the data lake, it can start to mine the data for important trends. For example, what happens when you know that your retail site is going to get a big influx of traffic on Black Friday? You have two options. The first is what we do now: guesstimate. This solution is fraught with challenges. If you guess that you need to increase your server and network resources and guess too low, you’re going to have an outage and unhappy customers. However, if you guesstimate in the other direction and have too many resources available, you will have happy customers but also very happy service providers that are charging you a lot of extra money for additional services you purchased and didn’t use.
Instead, using AIOps, you can choose a different path: educated guessing. You can analyze the data using an integration with IBM Watson to provide capacity planning projections. Now, instead of just hoping that you have the right amount of resources for Black Friday, you will have high confidence that you have the best setup to prevent both unhappy customers and big hosting bills at the end of the month. And that kind of predictability makes everyone inside the organization very happy.
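To show the general shape of an educated guess, here is a deliberately simple sketch that projects this year’s peak from past peaks and turns it into a server count. This is not how SL1 or IBM Watson produce their projections. The function, the traffic numbers, the per-server capacity, and the 20% headroom are all hypothetical.

```python
import math

def servers_needed(past_peaks, per_server_rps, headroom=0.2):
    """Project the next peak from average year-over-year growth.

    `past_peaks` holds peak requests per second from earlier years
    (at least two), oldest first. Returns (projected_peak, server_count).
    """
    ratios = [later / earlier
              for earlier, later in zip(past_peaks, past_peaks[1:])]
    growth = sum(ratios) / len(ratios)
    projected = past_peaks[-1] * growth
    # Add headroom so a slightly low projection does not cause an outage.
    count = math.ceil(projected * (1 + headroom) / per_server_rps)
    return projected, count

# Hypothetical Black Friday peaks from the last three years.
projected, count = servers_needed([1000, 1200, 1440], per_server_rps=100)
print(round(projected), count)  # 1728 21
```

The headroom value is where the tradeoff from the guesstimate scenario shows up: set it too low and you risk the outage, set it too high and you pay for servers you never use.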
Bringing It All Together
Analytics and data science can be a dirty job. Everything you can do to ensure your data is clean and trustworthy at every step of the process goes a long way towards keeping everything working quickly and easily. And when you start making predictions based on that data, you need to be sure that you’re not going to get an output that causes concern because it’s out-of-bounds from what you were expecting. ScienceLogic has done a ton of work ensuring that their AIOps platform gives you a source of good clean data and then uses it to help you do your job across your enterprise.