Should Machine Learning Be Applied to IT Operational Tools?

Machine learning is often viewed as a new technology, yet the concept of algorithms being applied to machines in order to learn and make predictions based on data has actually been a field of research for over 50 years. Arthur Samuel, an American pioneer in the field of computer gaming, defined machine learning in 1959 as a “field of study that gives computers the ability to learn without being explicitly programmed.” The concept of machine learning really began to flourish in the 1990s with the rise of the Internet and increasing availability of digital information.
It’s no surprise that the hype around machine learning has only increased over the years, especially as data-driven algorithms are being applied more commercially. Machine learning is now viewed as a “silver bullet” for analyzing large data sets in order to predict certain outcomes. The IT sector in particular has taken a strong interest in machine learning-based technology, especially as IT environments become software-defined and generate large volumes of data in real time.
IT operations (Ops) teams are now seeking “smarter” tools that can instantly detect unusual behaviors within complex IT environments. Yet, determining whether or not to deploy a machine learning-based IT Ops tool has become a difficult task. Understanding which elements to consider before investing in a machine learning IT operational tool is now critical. With this in mind, IT leaders should think about these three questions before taking a machine learning-based approach for IT Ops:

1) What are you trying to accomplish?

Machine learning isn’t magic that can be applied just anywhere. In fact, machine learning-based tools perform better in certain environments, which is why it’s important to consider your goals. Do you want to use a machine learning tool to help with security, cloud adoption, or moving from a “brownfield” to a “greenfield” IT environment (i.e., from an IT environment based on legacy tools to one made up of new technologies and tools)?
For instance, let’s say you want to use a machine learning tool for security purposes. Amazon recently announced a new AWS service called “Amazon Machine Learning.” The goal behind this service is to use powerful algorithms to create machine learning models by finding patterns in existing data. Amazon Machine Learning then uses these models to process new data and generate predictions for your application.
While Amazon’s machine learning service does boast a variety of benefits, such as automatically analyzing data in real time to quickly identify potential threats within a system, a retrospective analysis might actually be a better approach here. Splunk, for instance, searches across terabytes of data from traditional security sources, custom applications and databases. Splunk then provides a timeline view of all the collected data, including historical data. This data can also be saved and used to monitor alerts based on findings from the past data. A machine learning approach isn’t required at all in this instance.
When it comes to cloud adoption and moving from a brownfield to a greenfield IT environment, however, a machine learning-based approach proves to be the best solution because constant change requires analysis of big data in real time. Scale and change have gotten to the point where traditional rules-based approaches simply don’t work anymore.
If you are trying to assure service and guarantee uptime for business-critical applications, for instance, you need to spot unusual behaviors in a timely fashion. If you’re trying to do this within a virtualized, software-defined IT environment, then a machine learning-based approach is the only solution that can immediately spot anomalies. Similarly, if you are analyzing incoming tweets to predict a certain outcome, you shouldn’t rely on a retrospective analysis of data (past tweets). Rather, you’d want to analyze tweets in real time to spot anomalies as they occur in order to make predictions for future outcomes.
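To make “spotting anomalies as they occur” a bit more concrete, here is a minimal sketch in Python. It is not how any particular product works, and a rolling z-score is a simple statistical baseline rather than a full machine learning model. The class name, the window size, the threshold and the sample latency values are all assumptions chosen for illustration.

```python
from collections import deque
from math import sqrt


class RollingAnomalyDetector:
    """Flags values that deviate strongly from a rolling baseline (illustrative only)."""

    def __init__(self, window=30, threshold=3.0, min_samples=10):
        self.values = deque(maxlen=window)  # most recent observations
        self.threshold = threshold          # z-score above which a value is flagged
        self.min_samples = min_samples      # wait for some history before judging

    def observe(self, value):
        """Return True if value is anomalous relative to recent history."""
        is_anomaly = False
        if len(self.values) >= self.min_samples:
            mean = sum(self.values) / len(self.values)
            var = sum((v - mean) ** 2 for v in self.values) / len(self.values)
            std = sqrt(var)
            if std > 0 and abs(value - mean) / std > self.threshold:
                is_anomaly = True
        # The baseline keeps adapting as each new value arrives.
        self.values.append(value)
        return is_anomaly


# Hypothetical response times in milliseconds, with one obvious spike.
latencies_ms = [100, 102, 98, 101, 99, 103, 97, 100, 102, 98, 101, 450, 100]
detector = RollingAnomalyDetector(window=30, threshold=3.0, min_samples=10)
flags = [detector.observe(x) for x in latencies_ms]
print([x for x, flagged in zip(latencies_ms, flags) if flagged])
```

Each value is judged the moment it arrives, against a baseline that is derived from the data itself rather than written down in advance.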

2) Should you rely on a rules-based system?

Machine learning spots patterns in data in real time to help understand large data sets better. In contrast, a rules-based model is built around an infrastructure that produces events (anything with a time stamp and an associated message describing that something has happened) and tries to understand the meaning of these events in context with other events.
A rules-based infrastructure works well in that some rules never change. Yet other rules require maintenance, making it difficult for a rules-based system to predict the model of events that would allow IT Ops teams to understand key information in the context of all the other events and data being monitored.
As IT environments become more complex, legacy rules-based systems are being seen as deterministic. Rules-based infrastructures result in inconsistencies and ambiguities that IT teams simply can’t cope with, which is why more machine learning approaches are now being applied. A machine learning approach is data-driven, not based around models, so flexible logic can be applied in real time. This is important, especially when you have an infrastructure that’s in a constant state of flux.
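The difference between a fixed rule and a data-driven limit can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 80 percent CPU rule, the host names and the readings are hypothetical, and “mean plus three standard deviations” is one simple way to derive a limit from data, not the method of any specific tool.

```python
from statistics import mean, stdev

STATIC_CPU_LIMIT = 80.0  # hypothetical hand-written rule: alert above 80% CPU


def static_rule(value):
    """The same fixed limit for every host, regardless of its normal load."""
    return value > STATIC_CPU_LIMIT


def learned_limit(history, k=3.0):
    """Derive a per-host limit from that host's own recent history."""
    return mean(history) + k * stdev(history)


# Hypothetical CPU readings (percent) for two hosts with very different baselines.
history = {
    "batch-host": [85, 88, 90, 87, 86, 89, 91, 88],  # normally busy
    "idle-host": [5, 6, 4, 5, 7, 5, 6, 4],           # normally quiet
}
new_readings = {"batch-host": 89, "idle-host": 45}

results = {}
for host, value in new_readings.items():
    results[host] = {
        "static": static_rule(value),
        "learned": value > learned_limit(history[host]),
    }
print(results)
```

With these sample numbers the fixed rule fires on the busy host, where 89 percent is normal, and stays silent on the quiet host, where 45 percent is a large jump. The data-derived limit does the opposite, and it needs no manual maintenance when a host’s normal load shifts.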

3) How does a machine learning tool complement your other tools?

According to a 2015 Application Performance Monitoring survey, IT teams own between 11 and 25 different monitoring tools. Yet research indicates that only about 27 percent of application-related problems are detected by monitoring tools. To make matters worse, IT teams are notified of application-related issues through user calls approximately 25 percent of the time. How can this problem be solved?
A machine learning-based approach could help enhance monitoring tools’ limited features. For instance, large organizations use New Relic to detect service failures by monitoring their applications to avoid critical problems. A machine learning-based IT operational tool could sit on top of New Relic or another service to correlate and make sense of these alerts, while detecting the root cause of problems.
Machine learning algorithms ultimately provide the next piece of the puzzle for monitoring tools by ingesting data from these tools and pulling together all of the information within a single place, unifying these data points and analytics to create a real-time, single-pane-of-glass solution.
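As a rough sketch of what “pulling together” alerts from several tools might involve, the Python below groups alerts for the same service that arrive close together in time. This is a plain time-window heuristic standing in for the correlation step, not a machine learning algorithm and not the API of New Relic or any other product. The Alert fields, the source names and the 60-second gap are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    timestamp: float  # seconds since some reference point
    source: str       # name of the monitoring tool (hypothetical values below)
    service: str      # service the alert refers to
    message: str


def correlate(alerts, max_gap_seconds=60):
    """Group alerts for the same service that arrive close together in time."""
    incidents = []
    open_incident = {}  # service -> incident (list of alerts) still being extended
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        current = open_incident.get(alert.service)
        if current and alert.timestamp - current[-1].timestamp <= max_gap_seconds:
            current.append(alert)
        else:
            current = [alert]
            incidents.append(current)
            open_incident[alert.service] = current
    return incidents


# Hypothetical alerts from three different tools.
alerts = [
    Alert(0, "apm", "checkout", "slow response"),
    Alert(20, "infra", "checkout", "high CPU"),
    Alert(45, "logs", "checkout", "timeout errors"),
    Alert(30, "infra", "search", "disk warning"),
    Alert(600, "apm", "checkout", "slow response"),
]
incidents = correlate(alerts)
for incident in incidents:
    sources = sorted({a.source for a in incident})
    print(incident[0].service, len(incident), sources)
```

Five separate alerts collapse into three incidents, and the first one shows three tools reporting on the same service within a minute, which is the kind of single view the operator actually needs.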

Is a machine-learning tool right for you?

It’s clear that machine learning-based tools have a number of benefits for IT Ops. Machine learning techniques being applied in virtualized, cloud-based IT environments have become a must due to the constant scale and change occurring. Rules-based systems simply don’t provide the agility required for evolving IT infrastructures. Furthermore, machine learning techniques complement monitoring tools, providing root-cause analysis and real-time detection of incoming alerts.
While machine learning is still advancing, one thing remains certain: whether machine learning is being applied now or later, this technology will be adopted more in the next two to three years. Building CMDBs using a bottom-up approach will eventually become impossible due to the scale and change IT environments are facing with cloud adoption and the move from brownfield to greenfield ops.
Disclaimer: This article was written by a guest contributor in his/her personal capacity. The opinions expressed in this article are the author’s own and do not necessarily reflect those of