AsiaInfo’s Road to Superior Products 2: AISWare AIOps Realizes the Transformation from Automation to Intelligence for Operations

2020-09-21 AsiaInfo

With the development of technologies such as 5G, cloud computing and microservices, traditional operations methods face growing challenges, and intelligent operations have become the wave of the future. AISWare AIOps, AsiaInfo’s global intelligent operations platform, is positioned as an AIOps capability engine that brings intelligence to the operations systems of every domain. By working from actual needs, refining continuously and building differentiated advantages, the product has delivered good results in deployment and won a number of industry awards. Going forward, it will continue to develop with demand as its guide and forge ahead on the road of superior quality.

 



I. Background


Traditional operations methods face challenges, and intelligent operations have become the wave of the future.

 

Operations are indispensable in the ICT field. Having evolved from early manual operations through script-based and tool-based operations to today’s platform-based operations, operation monitoring capabilities have taken shape over years of development. Traditional operations methods, however, rely mostly on personal experience for analysis and decision-making. With the development and application of technologies such as 5G, cloud computing and microservices, system structures have grown more complex and technical components more diverse, and traditional methods now face many challenges. For example, static-threshold alarms are often a poor fit: it is hard to define accurate alarm rules by hand, one by one, for thousands of indicators and instances. Operation personnel receive thousands of alarms every day, which makes it difficult to locate the root cause quickly and accurately. And when a failure occurs, relying on personal experience and rules to sift through many service nodes and platform indicators is time-consuming and labor-intensive. Under these conditions, traditional operations methods and tools need a breakthrough.

 

The way out of this dilemma is to empower operations with AI, turning the manual summarizing of operation rules into an automatic learning process and making failure discovery, diagnosis, handling and prevention intelligent. Artificial Intelligence for Operations (AIOps for short) has become a new arena in the field of operations. Gartner’s latest report names AIOps as one of the ten major technology trends in infrastructure and operations and notes that the related industries are developing rapidly, with customer penetration expected to reach 40% in 2022. Data from another market research company, Markets and Markets, estimate that the AIOps market will grow from USD 2.55 billion in 2018 to USD 11.02 billion by 2023, with the Asia-Pacific region being the fastest-growing market during this period.



 

II. AISWare AIOps


Providing an AIOps capability engine that brings intelligence to the operations systems of every domain

 

1. Product positioning


AISWare AIOps, AsiaInfo’s global intelligent operations platform, is positioned as an AIOps capability engine that brings intelligence to the operations systems of every domain. The product is built on the algorithm models of AsiaInfo’s AI platform and focuses on three directions: quality assurance, cost management and efficiency improvement. It targets intelligent solution capabilities for operations scenarios such as failure discovery, diagnosis, handling and prediction across the whole process, as well as intelligent decision-making, intelligent question answering, capacity planning and resource optimization. These capabilities are encapsulated as components to build reusable and evolvable operations model and stipulation, which is exposed externally through Open API. This supports the rapid implementation of intelligent operations requirements, simplifies integration, and reduces the cost of adding intelligence to an operations system.

 

2. Core functions


(1) Highly cohesive operations model and stipulation: Focusing on the three major directions of quality assurance, cost management and efficiency improvement, the product provides highly cohesive, loosely coupled, scenario-based operations model and stipulation that connects to production systems in a lightweight way, supports the rapid implementation of intelligent operations requirements, and is easy to replicate and promote;


(2) Standardized API: The algorithm model and reasoning protocol are encapsulated in the model and stipulation, which integrates with third-party systems through Open API and components to make those systems intelligent;


(3) One-stop development and operation: Operations developers can combine their own business scenario requirements with existing services to quickly define, train and release personalized operations model and stipulation, and the product provides operation management and operation monitoring services.

 

3. Product advantages


AISWare AIOps is continuously refined against actual needs. On the one hand, the range of scenarios supported by the operations model and stipulation keeps expanding as needs arise; at present, the product supports more than 40 AIOps scenarios. On the other hand, the models are continuously optimized using business understanding of the operations scenarios, which improves their results. The product now has a series of self-developed algorithm capabilities that adapt better to actual operations scenarios than open source algorithms and perform well. Taking indicator anomaly detection as an example, the self-developed algorithms improve failure-discovery accuracy by about 30% and cut resource consumption by 50% compared with algorithms such as LSTM, and their recall and precision rates are among the best in the industry.
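The recall and precision rates quoted for anomaly detection follow the standard definitions: precision is the share of flagged points that are real anomalies, and recall is the share of real anomalies that were flagged. The sketch below only illustrates those definitions; the function name and the sample timestamps are hypothetical and are not taken from the product.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two collections of anomalous timestamps."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    # Guard against empty inputs so the ratios are always defined.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical example: the detector flags five points, four of which are real.
flagged = {3, 7, 12, 20, 31}
real = {3, 7, 12, 20}
precision, recall = precision_recall(flagged, real)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A detector that raises many alarms tends to score high on recall and low on precision, which is why the two figures are usually reported together.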

 

After long-term refinement, the product has gradually built differentiated advantages:

 

(1) Abundant scenarios: It fully supports scenarios across quality assurance, cost management and efficiency improvement, uses the operations model and stipulation to implement intelligent operations requirements rapidly, and solves actual operations problems;


(2) Algorithm accumulation: Drawing on long-term accumulation and aimed at the needs of complex operations scenarios, the model and stipulation is built with self-developed algorithm models, which adapt better to actual scenarios and achieve better results than open source algorithms;


(3) Component integration: Decoupled from the operations system, it provides the model and stipulation as components, integrates with related systems through Open API and simplifies adoption, which avoids repeated construction and reduces the cost of adding intelligence;


(4) Platform support: The combination of “platform + model and stipulation” supports intelligent operations capabilities and provides integrated services for building, opening, operating and managing the model and stipulation.

 

 

III. Cases


Proven in real market scenarios, creating value and earning recognition

 

At present, AISWare AIOps is widely used in the intelligent operations practices of major domestic telecom operators, and it is also suitable for industries such as power, radio and television, finance and energy.

 

1. Application cases

 

Case 1: Golden indicator anomaly detection

AISWare AIOps was introduced into a telecom operator’s monitoring alarms for indicators such as Kafka Topic traffic, business volume and load-balancer response delay, where it performs intelligent anomaly detection with dynamic thresholds. It handles over 12 million calls per day on average, with a recall rate of 99% and a precision rate of around 90%, and has successfully predicted multiple failures.
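To illustrate what a dynamic threshold means in contrast to a fixed one, the sketch below derives the alarm band from the recent history of the indicator itself (rolling mean plus or minus k standard deviations). This is a generic textbook technique chosen for illustration, not AsiaInfo’s self-developed algorithm; the function name, the window size, the multiplier k and the traffic values are all assumptions.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(values, window=5, k=3.0):
    """Flag indexes whose value falls outside mean ± k*std of the preceding window."""
    history = deque(maxlen=window)
    anomalies = []
    for index, value in enumerate(values):
        if len(history) == window:
            baseline = mean(history)
            # Floor the deviation so a perfectly flat history still yields a band.
            band = k * max(stdev(history), 1e-9)
            if abs(value - baseline) > band:
                anomalies.append(index)
                continue  # keep the anomaly out of the baseline
        history.append(value)
    return anomalies


# Hypothetical per-minute traffic samples with one obvious spike.
traffic = [100, 102, 98, 101, 99, 103, 100, 250, 101, 97]
print(detect_anomalies(traffic))
```

Because the band is recomputed from each indicator’s own recent values, no per-indicator rule has to be written by hand, which is the property that matters when thousands of indicators are monitored.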

 

Case 2: Alarm root cause analysis and convergence

AISWare AIOps was introduced into the O-domain alarm convergence scenario of a telecom operator. By dynamically mining alarm RCA rules, it locates the root cause alarm in real time. The current alarm convergence rate is 98%, which effectively alleviates alarm storms.
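As a rough illustration of alarm convergence, the sketch below collapses raw alarms into their root cause alarm by following a set of RCA rules, then reports a convergence rate. The rule table, alarm names and function are hypothetical, and the rate is assumed here to mean the share of raw alarms suppressed (raw alarms minus root alarms, divided by raw alarms); the article does not state how its 98% figure is defined, and in the product the rules are mined dynamically instead of being written by hand.

```python
def converge(alarms, rca_rules):
    """Collapse alarms into root causes.

    alarms: list of alarm type names, one entry per raw alarm.
    rca_rules: dict mapping a derived alarm type to the alarm type that causes it.
    Returns (set of root alarm types, convergence rate).
    """
    roots = set()
    for alarm_type in alarms:
        root, seen = alarm_type, set()
        # Follow the rule chain upward; `seen` protects against cyclic rules.
        while root in rca_rules and root not in seen:
            seen.add(root)
            root = rca_rules[root]
        roots.add(root)
    rate = (len(alarms) - len(roots)) / len(alarms) if alarms else 0.0
    return roots, rate


# Hypothetical rules: queue backlog <- API timeout <- slow query <- disk saturation.
rules = {
    "queue_backlog": "api_timeout",
    "api_timeout": "db_slow_query",
    "db_slow_query": "disk_io_saturated",
}
raw = ["api_timeout"] * 4 + ["queue_backlog"] * 3 + ["db_slow_query"] * 2 + ["cert_expiring"]
roots, rate = converge(raw, rules)
print(sorted(roots), rate)
```

In this example ten raw alarms reduce to two root alarms, so operation personnel see two items to act on instead of ten.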

 

Case 3: Microservice application system failure location

AISWare AIOps was introduced into a real scenario of failure location on microservice call chains. Looking at the entire link, it locates the root cause node of the call chain through comprehensive analysis of the running status and calling relationships of each service, then combines platform indicator operating data and topological relationships to infer and locate the actual root cause of the failure. In actual test data, the average recall rate exceeds 85% and the precision rate is 80%, which greatly shortens failure location time.
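One common heuristic for locating the root cause node on a call chain is to look for abnormal services whose own downstream dependencies are all healthy, since errors tend to propagate upward from callee to caller. The sketch below shows only that heuristic under simplified assumptions; it is not the product’s method, which the article describes as also using platform indicator data and topology. The service names and call graph are hypothetical.

```python
def root_cause_candidates(calls, abnormal):
    """Return abnormal services that have no abnormal downstream dependency.

    calls: dict mapping a caller service to the list of services it calls.
    abnormal: set of services currently showing errors or high latency.
    """
    candidates = []
    for service in sorted(abnormal):
        downstream = calls.get(service, [])
        # If every callee is healthy, the fault most likely starts here.
        if not any(dep in abnormal for dep in downstream):
            candidates.append(service)
    return candidates


# Hypothetical call chain: gateway -> order -> (inventory, payment), payment -> db.
calls = {
    "gateway": ["order"],
    "order": ["inventory", "payment"],
    "payment": ["db"],
    "inventory": [],
    "db": [],
}
abnormal = {"gateway", "order", "payment", "db"}
print(root_cause_candidates(calls, abnormal))
```

Here four services look unhealthy, yet only db has no unhealthy dependency of its own, so it is the single candidate passed on for further diagnosis.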

 

2. Product value

 

(1) Improve overall operations efficiency: It intelligently locates the root cause of a failure and provides a handling strategy, which reduces dependence on personal experience, greatly shortens failure location time, and improves overall operations efficiency;


(2) Guarantee system operation quality: When indicators begin to degrade, the failure early-warning engine gives timely warning of business and system risks, which helps avoid production failures and ensures the quality of system operation;


(3) Reasonably control operating costs: It supports intelligent assessment and optimization of resource efficiency and sound capacity planning, which improves resource utilization and keeps operating costs under control;


(4) Enhance per capita operations capabilities: By introducing AIOps capabilities and technologies, it frees operation personnel from handling complicated alarms and high-frequency repetitive problems, which enhances per capita operations capabilities.

 

3. Industry honors

 

Thanks to its excellent commercial results and industry recognition, AISWare AIOps has won many industry awards. It was runner-up in the Third International AIOps Challenge in 2020 and won the “Best Catalyst” award at the TMF Asia Summit in 2019, and its “Cross-domain Intelligent Alarm Root Cause Analysis” and other cases were included in the GSMA’s Smart Autonomous Network Case Report and published on the GSMA portal.

 



IV. Outlook


Operations empowerment continues to develop on the road to superior quality

 

The AIOps market is gradually returning to rationality and needs to pay more attention to the urgent problems that IT/CT operations face today: how to integrate the model and stipulation into highly intelligent solutions for complex scenarios, promote their application in operations, and ultimately move toward unattended intelligent operations. AISWare AIOps will uphold its mission of making operations systems intelligent, continue to refine product capabilities guided by front-line requirements, and forge ahead on the road of superior quality.