What does your operations team do in your organization? Keep the lights on? Work all the trouble tickets? Or are they on the verge of being rolled into other teams that plan the architecture or develop the solutions that are put in front of your users? Would you even know what those ops teams are currently tasked with if you had to do their job for a week?
That may sound a bit pedantic, but it's a fair question. The average enterprise employee outside of IT isn't really sure what operations teams do. They can tell you what the building maintenance people are up to, or what the catering delivery people do if they work in an organization that brings in lunches every day. But when you ask about IT operations, the common answer is a simple shrug and maybe a passing reference to "computer stuff" or something like that.
The truth is that most IT operations teams aren’t surfing the wave of the future with DevOps enablement or weaving code in thin air like some kind of VR-enhanced Minority Report fever dream. It’s nothing quite so magical, in fact. The operations team is probably busy closing the multitude of tickets that get generated every day. And by now those teams are very good at doing it.
The majority of incidents logged in ticketing systems are very repetitive and take around 90 minutes each to resolve. Multiply that by the number of users in your organization and the average number of tickets they open each week and you can see that the operations team spends most of its time treading water, doing the same things over and over again. That repetition can hurt your reliability as well. If the operations teams are busy closing tickets for password resets or application issues, they aren't focused on the up/down status of things in the organization. In fact, the odds are good that it's the DevOps wizards who are watching the observability metrics, catching issues before they crop up and keeping things chugging along smoothly.
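To put rough numbers on that multiplication, here is a minimal sketch. Only the 90 minutes per ticket comes from the figure above; the user count, the tickets per user per week, and the 40-hour work week are hypothetical values picked for illustration, so swap in your own.

```python
# Back-of-the-envelope estimate of weekly ticket load.
# Only MINUTES_PER_TICKET comes from the article; every other number
# used below is a hypothetical input chosen for illustration.

MINUTES_PER_TICKET = 90


def weekly_ticket_hours(users, tickets_per_user_per_week,
                        minutes_per_ticket=MINUTES_PER_TICKET):
    """Total hours per week spent resolving tickets."""
    return users * tickets_per_user_per_week * minutes_per_ticket / 60


def engineers_needed(ticket_hours, hours_per_engineer_week=40.0):
    """Full-time engineers consumed by that ticket load (assumed 40-hour week)."""
    return ticket_hours / hours_per_engineer_week


if __name__ == "__main__":
    # Hypothetical organization: 1,000 users, one ticket per user every four weeks.
    hours = weekly_ticket_hours(users=1000, tickets_per_user_per_week=0.25)
    print(f"Ticket hours per week: {hours}")
    print(f"Full-time engineers tied up: {engineers_needed(hours)}")
```

Even with fairly conservative made-up inputs, the ticket queue can absorb the equivalent of several full-time people, which is time not spent watching availability.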
Moving Operations Into The Future
Now that we know what operations teams do, how can we help? What is available to make their jobs easier? The biggest first step would be to automate those repetitive tasks that they have to do over and over again every week. Systems automation has this aura of being something grandiose and complicated that must be rolled out in phases to be successful and will take jobs away from people.
The reality is that automating the easy, repetitive tasks has a much bigger influence on how automation is adopted in an organization. Automating your network design and engineering roles is a massive undertaking. Automating password resets is much more achievable and has an immediate impact on the productivity of your operations teams and on your system reliability.
You need the right system to collect the data about your periodic trouble tickets and you need the right solution to help you automate those responses into something that your operations teams can rely upon. I had the chance to talk to a company that is doing just that recently and learn more about their perspective.
Shoreline.io was founded by a group of cloud rock stars. They have come from Amazon and Oracle and they have a mission: to automate the tickets that swamp the operations teams in order to improve system reliability. They showed off a case study from AWS where their automation system increased the availability of the system by simply automating the response to common repeatable tickets. How did they do that?
Shoreline worked from a very simple premise. The number of tickets coming into the system was likely going to be a constant. No amount of improved technology or increased automation is going to reduce the number of tickets generated. Instead, the key is to automate the response to those tickets. Shoreline is able to query across systems and store data to analyze for solutions to common issues. They can do this rapidly, in real time, because they collect and compress data on the fly with some impressive algorithms. How impressive? They're looking at a 30x compression ratio with around 99.5% accuracy of the stats. That means they can collect and store data quickly and still trust that it is accurate.
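To get a feel for what those two figures imply, here is a small arithmetic sketch. It is not Shoreline's algorithm. It assumes "30x" means stored size is raw size divided by 30, and it reads "99.5% accuracy" as a relative error of up to 0.5% on a stored value, which is my interpretation rather than something stated by the company. The sample sizes and values are hypothetical.

```python
# Rough arithmetic on the quoted figures; not Shoreline's actual method.
# Assumptions (mine): "30x" = raw size / 30, and "99.5% accuracy" = a
# stored stat may deviate from the true value by up to 0.5%.

COMPRESSION_RATIO = 30
ACCURACY = 0.995


def compressed_size_gb(raw_gb, ratio=COMPRESSION_RATIO):
    """Storage needed after compression, in GB."""
    return raw_gb / ratio


def worst_case_error(value, accuracy=ACCURACY):
    """Largest absolute deviation under the assumed relative-error reading."""
    return abs(value) * (1 - accuracy)


if __name__ == "__main__":
    raw_gb = 300.0        # hypothetical daily metrics volume
    latency_ms = 200.0    # hypothetical stored statistic
    print(f"{raw_gb} GB raw -> {compressed_size_gb(raw_gb)} GB stored")
    print(f"{latency_ms} ms could be off by about "
          f"{worst_case_error(latency_ms):.2f} ms")
```

Under those assumptions, the trade is a small, bounded loss of precision in exchange for storing and querying roughly a thirtieth of the data.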
That speed means they are able to make decisions quickly enough to avoid problem lag. Lag in your solution is an issue because by the time you are able to put the fix in place, you may be solving a problem that no longer exists. Worse yet, your fix may end up causing other issues that need to be resolved. This back-and-forth correction of issues isn't all that different from a game of Whack-a-Mole. However, the result isn't a prize but instability in your enterprise.
Shoreline can automate these responses for rapid deployment. Right now they're focused on pushing these solutions through the GUI and CLI of the devices they interact with. They're already looking at deploying API-driven fixes down the road, but right now the operations teams are more likely to be clicking and typing than programming and deploying. In keeping with the idea of solving the easy problems today, Shoreline tracks exactly with the way the operations teams already solve those tickets. Shoreline just does it faster, more accurately, and every time you ask instead of putting it off until tomorrow. This means your operations teams have more time to worry about other things that impact availability for your users.
Bringing It All Together
Practical automation is about using your resources to their fullest potential. The potential of a script is only as good as the way it's written. The potential of a person is way, way beyond programming. With the right observability tools, such as the ones built into the Shoreline.io platform, you can enable your operations teams to do more than just close the same tickets over and over again. Automating things means your people do the things they're good at and the automation does the things it is good at. And that's the kind of enablement your operations team is begging to try out.
For more information about Shoreline.io and their automation and observability solution, please make sure to check out their website at http://Shoreline.io