Sercompe Business Technology provides essential cloud services to approximately 60 corporate clients, supporting a total of about 50,000 users. It is therefore crucial that the Joinville, Brazil, company’s underlying IT infrastructure deliver reliable service with predictably high performance. But with a complex IT environment that includes more than 2,000 virtual machines and 1 petabyte (the equivalent of one million gigabytes) of managed data, it was overwhelming for network administrators to sort through all the data and alerts to figure out what was happening when problems cropped up. And it was difficult to determine whether network and storage capacity were where they should be, or when to do the next upgrade.
To help untangle the complexity and increase the efficiency of its support engineers, Sercompe invested in an artificial intelligence operations (AIOps) platform, which uses AI to get to the root cause of problems and warn IT administrators before small problems become big ones. Now, according to cloud product manager Rafael Cardoso, the AIOps system does most of the work of managing the company’s IT infrastructure, a big improvement over the older manual methods.
“Figuring out when I needed more space or capacity was a mess before. We needed to get information from many different sources when we were planning. We never got the number right,” says Cardoso. “Now I have a complete view of the infrastructure, from the virtual machines to the final disk in the rack.” AIOps brings visibility into the entire environment.
Before deploying the technology, Cardoso was where numerous other organizations find themselves: entangled in a complex web of IT systems, with interdependencies between the layers of hardware, virtualization, middleware, and finally applications. Any disruption or downtime could lead to tedious manual troubleshooting and, ultimately, a negative impact on the business: a website that no longer works, for example, and angry customers.
AIOps platforms help IT managers master the task of automating IT operations by using AI to provide quick insight into how infrastructure is performing: which areas are running smoothly and which are at risk of triggering a downtime event. Gartner is credited with coining the term AIOps in 2016 for a broad category of tools designed to overcome the limitations of traditional monitoring tools. The platforms use self-learning algorithms to automate routine tasks and understand the behavior of the systems they observe. They draw insights from performance data to identify and monitor irregular behavior in IT infrastructure and applications.
Market research firm BCC Research estimates that the global market for AIOps will grow from $3 billion in 2021 to $9.4 billion by 2026, a compound annual growth rate of 26%. Gartner analysts write in their April “Market Guide for AIOps Platforms” that the increasing rate of AIOps adoption is driven by digital business transformation and the need to move from reactive responses to infrastructure issues to proactive actions.
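As a quick sanity check on those figures, the growth rate implied by the two market sizes can be worked out directly. The sketch below assumes a five-year span (2021 to 2026) and the standard compound-growth formula; the function name `cagr` is an illustrative choice, not something taken from the BCC Research report.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate, returned as a fraction (0.26 means 26%)."""
    return (end_value / start_value) ** (1 / years) - 1


# Market sizes in billions of US dollars, as quoted in the article.
rate = cagr(3.0, 9.4, 2026 - 2021)
print(f"Implied annual growth: {rate:.1%}")
```

The result comes out just under 26%, consistent with the rounded rate quoted above.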
With data volumes reaching or exceeding gigabytes per minute across a dozen or more different domains, Gartner analysts write, it is no longer possible for humans to analyze the data manually. Applying AI systematically speeds up insight and enables proactive action.
According to Mark Esposito, chief learning officer at automation technology company Nexus FrontierTech, the term “AIOps” evolved from “DevOps” – a software engineering culture and practice aimed at integrating software development and operations. “The idea is to advocate automation and monitoring at all stages, from software construction to infrastructure management,” says Esposito. Recent innovations in this area include the use of predictive analytics to anticipate and resolve problems before they affect IT operations.
AIOps helps infrastructure fade into the background
Saurabh Kulkarni, head of engineering and product management at Hewlett Packard Enterprise, says network and IT administrators who are overwhelmed by data volumes and increasing complexity can use the help. Kulkarni works on HPE InfoSight, a cloud-based AIOps platform for proactively managing data center systems.
“IT administrators spend a lot of time planning their work, planning deployments, adding new nodes, compute, storage, and everything. And when something goes wrong in the infrastructure, it’s very difficult to manually debug those issues,” says Kulkarni. “AIOps uses machine-learning algorithms to look at patterns, examine repeated behaviors, and learn from them to give users quick recommendations.” Beyond the storage nodes, each part of the IT infrastructure sends its own alerts so that problems can be resolved quickly.
The InfoSight system collects data from all devices in the customer’s environment and then correlates it with data from HPE customers with similar IT environments. The system can pinpoint a potential problem so that it can be resolved quickly. If the problem reappears, the fix can be applied automatically. Alternatively, the system sends alerts so that IT teams can resolve the issue quickly, Kulkarni adds. Take the case of a storage controller that fails because it has lost power. Instead of assuming the problem is only storage-related, the AIOps platform surveys the entire infrastructure stack, up to the application level, to identify the root cause.
“The system monitors performance and can detect discrepancies. We have algorithms that constantly run in the background to detect any abnormal behaviors and warn customers before a problem occurs,” says Kulkarni. The philosophy behind InfoSight is to “make the infrastructure invisible” by bringing IT systems and all their telemetry data into a single pane of glass. Looking at one large set of data, administrators can quickly understand what’s going wrong with the infrastructure.
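To make the idea of background algorithms that flag abnormal behavior more concrete, here is a minimal sketch of one common approach, a rolling z-score over recent telemetry samples. It is an illustration under stated assumptions and not a description of how HPE’s platform works: the function name `detect_anomalies`, the window size, the threshold, and the sample latency values are all hypothetical.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate strongly from the recent window.

    A sample is flagged when it lies more than `threshold` standard
    deviations from the mean of the previous `window` samples.
    """
    history = deque(maxlen=window)  # keeps only the most recent samples
    anomalies = []
    for i, value in enumerate(samples):
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            # A perfectly flat window (sigma == 0) is skipped to avoid
            # dividing the comparison by zero variation.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                anomalies.append(i)
        history.append(value)
    return anomalies


# Hypothetical storage latency readings in milliseconds, with one spike.
latency_ms = [10, 11, 10, 12, 11, 10, 11, 48, 11, 10]
print(detect_anomalies(latency_ms))  # flags index 7, the 48 ms spike
```

A production system would likely combine many such signals and learn thresholds from historical data rather than fixing them by hand.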
Kulkarni recalls the difficulty of managing a large IT environment from past jobs. “I had to manage a big data set, and I had to call a lot of vendors and stay on hold for hours trying to find problems,” he says. “Sometimes it took us days to figure out what was really going on.”
By automating data collection and tapping a wealth of data to understand root causes, AIOps enables companies to reassign key employees, including IT administrators, storage administrators, and network administrators, simplifying infrastructure and streamlining roles so that staff can get more done. “Previously, companies had multiple roles and different departments handling different things. So even to deploy a new storage area, five different admins had to do their own individual work,” says Kulkarni. But with AIOps, AI handles most of the work automatically, so IT and support staff can devote their time to more strategic initiatives, increasing efficiency and, in the case of a business that provides technical support to its customers, improving profit margins. Sercompe’s Cardoso, for example, has been able to reduce the average time his support engineers spend on customer calls, reflecting a better customer experience while increasing efficiency.
This content was created by Insights, the custom content arm of MIT Technology Review. It was not written by the editorial staff of MIT Technology Review.