Patent classifications
H04L41/0636
FAILURE IMPACT ANALYSIS OF NETWORK EVENTS
Failure impact analysis (or “impact analysis”) is a process that involves identifying effects that may or will result from a network event. In one example, this disclosure describes a method that includes generating, by a control system managing a resource group, a resource graph that models resource and event dependencies between a plurality of resources within the resource group; detecting, by the control system, a first event affecting a first resource of the plurality of resources, wherein the first event is a network event; and identifying, by the control system and based on the dependencies modeled by the resource graph, a second resource that is expected to be affected by the first event.
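The identification step in the abstract above can be illustrated with a small graph traversal. This is a minimal sketch, not the method claimed in the disclosure: it assumes the resource graph is stored as an adjacency mapping from each resource to the resources that depend on it, and that every downstream dependent is "expected to be affected". The function name `affected_resources`, the `RESOURCE_GRAPH` contents, and the resource names are hypothetical.

```python
from collections import deque


def affected_resources(dependents, failed):
    """Breadth-first walk over dependency edges starting at the failed resource.

    `dependents` maps a resource to the resources that depend on it
    (an assumed representation of the resource graph).
    Returns downstream resources in the order they are discovered.
    """
    seen = {failed}
    order = []
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for nxt in dependents.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


# Hypothetical resource graph: key -> resources that depend on the key.
RESOURCE_GRAPH = {
    "port-1": ["link-A"],
    "link-A": ["tunnel-X", "tunnel-Y"],
    "tunnel-X": ["vpn-1"],
    "tunnel-Y": [],
}

# A network event on port-1 is expected to affect everything downstream of it.
print(affected_resources(RESOURCE_GRAPH, "port-1"))
# → ['link-A', 'tunnel-X', 'tunnel-Y', 'vpn-1']
```

A real control system would likely also model event types and partial impact on the edges; the sketch only shows reachability.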
METHOD AND SYSTEM FOR ROOT CAUSE ANALYSIS ACROSS MULTIPLE NETWORK SYSTEMS
Method and system for Root Cause Analysis (RCA) across multiple network systems. Update information of a first local root cause analysis mechanism is received. An RCA controller generates, based on the update information, a new node to be added to a global root cause decision tree, where the global root cause decision tree is to be shared by at least two of the plurality of network operators. The RCA controller requests storage of the new node in a distributed ledger that is shared by network operators. The RCA controller participates in a verification operation of the new node. In response to determining that the verification operation is successful, the RCA controller adds an entry including the new node to the distributed ledger as part of the global root cause decision tree. Alternatively, when the verification operation is not successful, the new node is not added to the distributed ledger.
Switching among multiple machine learning models during training and inference
Systems and methods for analyzing and prioritizing alarms in a communications network are provided. A method, according to one implementation, includes the step of obtaining network information regarding the condition of a network. Using the network information, the method further includes performing a hybrid Machine Learning (ML) technique that includes training and inference of a plurality of ML models to calculate metrics of the network. Also, the method includes the step of selecting one of the plurality of ML models based on a combination of the metrics.
Directed incremental clustering of causally related events
Described systems and techniques determine causal associations between events that occur within an information technology landscape. Individual situations that are likely to represent active occurrences requiring a response may be identified as causal event clusters, without requiring manual tuning to determine cluster boundaries. Consequently, it is possible to identify root causes, analyze effects, predict future events, and prevent undesired outcomes, even in complicated, dispersed, interconnected systems.
SYSTEM AND METHOD FOR ANOMALY DETECTION WITH ROOT CAUSE IDENTIFICATION
A computer device may include a processor configured to obtain key performance indicator (KPI) values for KPI parameters associated with at least one device and compute a set of historical statistical values for the obtained KPI values associated with the network device. The processor may be further configured to provide the KPI values and the computed set of historical statistical values to an anomaly detection model to identify potential anomalies; filter the identified potential anomalies based on a designated desirable behavior for a particular KPI parameter to identify at least one anomaly; and send an alert that includes information identifying the at least one anomaly to a management system or a repair system associated with the device. The computer device may further determine a root cause KPI parameter for the identified at least one anomaly and include information identifying the determined root cause KPI parameter in the alert.
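The flow described above (historical statistics, anomaly detection, filtering by desirable behavior) can be sketched as follows. The abstract does not specify the anomaly detection model, so a plain z-score test is swapped in here as a stand-in. The function name `detect_anomalies`, the threshold of 3.0, the KPI names, and the `"higher"`/`"lower"` encoding of desirable behavior are all assumptions made for illustration.

```python
from statistics import mean, stdev


def detect_anomalies(history, current, desirable, threshold=3.0):
    """Flag KPIs whose current value deviates strongly from their history.

    A deviation in the desirable direction (for example, unusually high
    throughput) is filtered out rather than reported.
    """
    alerts = []
    for kpi, value in current.items():
        mu = mean(history[kpi])
        sigma = stdev(history[kpi])
        if sigma == 0:
            continue  # no variation in history, z-score is undefined
        z = (value - mu) / sigma
        if abs(z) < threshold:
            continue  # within the normal range
        if desirable.get(kpi) == "higher" and z > 0:
            continue  # deviation in the good direction
        if desirable.get(kpi) == "lower" and z < 0:
            continue  # deviation in the good direction
        alerts.append((kpi, z))
    return alerts


# Hypothetical KPI data for one device.
history = {
    "throughput_mbps": [100, 102, 98, 101, 99],
    "latency_ms": [20, 21, 19, 20, 20],
}
current = {"throughput_mbps": 120, "latency_ms": 30}
desirable = {"throughput_mbps": "higher", "latency_ms": "lower"}

for kpi, z in detect_anomalies(history, current, desirable):
    print(f"ALERT {kpi}: z-score {z:.1f}")
```

Both KPIs deviate from their history here, but only the latency spike is reported, because the throughput jump is in the desirable direction.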
Software-defined network monitoring and fault localization
The disclosure describes techniques for network monitoring and fault localization. For example, a controller comprises one or more processors operably coupled to a memory configured to: receive a first one or more Quality of Experience (QoE) metrics measured by a first probe traversing a first path comprising one or more links; receive a second one or more QoE metrics measured by a second probe traversing a second path comprising one or more links; determine, from the first one or more QoE metrics, that the first path has an anomaly; determine, from the second one or more QoE metrics, that the second path has an anomaly; and determine, in response to determining that the first path and the second path each have an anomaly, based on the type of metrics and the type of links, that an intersection between the first path and the second path is a root cause of the anomaly.
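The localization idea in the abstract above, that links shared by anomalous probe paths are root-cause candidates, can be sketched with a set intersection. This is a simplified illustration: it ignores the metric types and link types that the abstract says also inform the decision, and the function name `shared_links`, the probe names, and the link names are hypothetical.

```python
def shared_links(paths, anomalous):
    """Return the links common to every anomalous probe path.

    `paths` maps a probe name to the list of links its path traverses;
    `anomalous` lists the probes whose QoE metrics indicated an anomaly.
    """
    link_sets = [set(paths[probe]) for probe in anomalous]
    if not link_sets:
        return set()
    return set.intersection(*link_sets)


# Hypothetical probe paths; both traverse link "l2".
PATHS = {
    "probe-1": ["l1", "l2", "l3"],
    "probe-2": ["l4", "l2", "l5"],
}

print(shared_links(PATHS, ["probe-1", "probe-2"]))
# → {'l2'}
```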
Automatic diagnostics alerts
Generating automatic diagnostics alerts is disclosed. At a first time, a set of quality metrics for a plurality of groups of streaming sessions is computed. An anomaly is identified at least in part by performing anomaly detection using the set of quality metrics and historical information. A cause of the identified anomaly is diagnosed. An alert is generated based at least in part on the diagnosis.
Learning based incident or defect resolution, and test generation
In some examples, learning based incident or defect resolution, and test generation may include ascertaining historical log data that includes incident or defect log data associated with operation of a process, and generating, based on the historical log data, step action graphs. Based on grouping of the step action graphs with respect to different incident and defect tickets, an incident and defect action graph may be generated to further generate a machine learning model. Based on an analysis of the machine learning model with respect to a new incident or defect, an output that includes a sequence of actions may be generated to reproduce, for the new incident, steps that result in the new incident, reproduce, for the new defect, an error that results in the new defect, identify a root cause of the new incident or defect, and/or resolve the new incident or defect.
CELL ACCESSIBILITY PREDICTION AND ACTUATION
A method for predicting cell accessibility issues for a mobile network. The method includes receiving a set of metrics from the mobile network, processing a set of key performance indicators (KPIs) derived from the set of metrics in an ensemble machine learning model, the ensemble machine learning model including an RRC model, an RACH model, an ERAB model, and an S1 signaling model to generate at least one cell accessibility degradation prediction and a confidence score, and applying a root cause mapping to the at least one cell accessibility degradation prediction and the confidence score to identify at least one recommended action to correct a correlated cell accessibility issue.