H04L41/064

EVENT RELATIONSHIP ANALYSIS IN FAULT MANAGEMENT
20170235623 · 2017-08-17

A method and system are provided for event relationship analysis in fault management. The method includes: providing a history of a plurality of event instances relating to multiple events identified by event identifiers, where an event instance has one or more event occurrences referencing an event identifier, the history including the event occurrences and resolution event information; analyzing the event occurrences relating to each event identifier to identify the first occurrence of each event instance; analyzing the resolution event information relating to each event identifier to identify any event resolution time for an event instance; comparing two event identifiers to obtain a relationship score between the two event identifiers, wherein the comparing is based on a combination of first occurrences of event instances relating to the two event identifiers and resolution times of the event instances; and creating a group of events that are related based on the relationship scores.
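
The comparison step can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so the one below is an assumption: the score is the share of instances of one event that both start and resolve within a tolerance of some instance of the other event. The history data, the names `relationship_score` and `group_related`, and the `tolerance` and `threshold` values are all hypothetical.

```python
from itertools import combinations

# Hypothetical history: event identifier -> one (first_occurrence, resolution_time)
# pair per event instance, in seconds. Illustrative data only.
history = {
    "LINK_DOWN": [(100, 400), (1000, 1300), (5000, 5200)],
    "BGP_PEER_LOST": [(105, 410), (1010, 1295), (9000, 9100)],
    "FAN_FAILURE": [(3000, 3600)],
}

def relationship_score(a, b, tolerance=30):
    """Assumed formula: instances of `a` that start AND resolve within
    `tolerance` seconds of some instance of `b`, divided by the larger
    instance count of the two events."""
    if not a or not b:
        return 0.0
    matched = sum(
        1
        for first_a, resolved_a in a
        if any(
            abs(first_a - first_b) <= tolerance
            and abs(resolved_a - resolved_b) <= tolerance
            for first_b, resolved_b in b
        )
    )
    return matched / max(len(a), len(b))

def group_related(history, threshold=0.5):
    """Merge event identifiers whose pairwise score reaches `threshold`."""
    groups = {event_id: {event_id} for event_id in history}
    for x, y in combinations(history, 2):
        if relationship_score(history[x], history[y]) >= threshold:
            merged = groups[x] | groups[y]
            for event_id in merged:
                groups[event_id] = merged
    unique = []
    for group in groups.values():
        if group not in unique:
            unique.append(group)
    return unique

print(group_related(history))
```

With this data, LINK_DOWN and BGP_PEER_LOST match on two of three instances and end up in one group, while FAN_FAILURE stays alone.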

Managing failure behavior for computing nodes of provided computer networks
09736016 · 2017-08-15

Techniques are described for providing managed computer networks. In some situations, the techniques include managing communications for computing nodes of a managed computer network by using one or more particular computing nodes of the managed computer network that are configured to operate as intermediate destinations to handle at least some communications that are sent by and/or directed to one or more other computing nodes of the managed computer network. In addition, the techniques may include managing the communications in accordance with configured failure behavior specified for one or more computing nodes of the computer network, such as specified failure behavior for a computing node configured to operate as an intermediate destination that indicates how communications that would otherwise be routed via the intermediate destination computing node are to be handled if the intermediate destination computing node fails or is otherwise unavailable (e.g., to block or allow such communications).
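
As a rough illustration of configured failure behavior, the sketch below decides what happens to a communication that would normally pass through an intermediate destination node. The function name `route`, its parameters, and the "allow"/"block" strings are hypothetical; the abstract only states that such communications may be blocked or allowed when the intermediate node is unavailable.

```python
def route(destination, intermediate, intermediate_available, failure_behavior):
    """Return (action, next_hop) for one communication.

    failure_behavior is the behavior configured for the intermediate node:
    "allow" forwards directly to the destination when the node is down,
    "block" drops the communication instead.
    """
    if intermediate_available:
        return ("forward", intermediate)
    if failure_behavior == "allow":
        return ("forward", destination)
    if failure_behavior == "block":
        return ("drop", None)
    raise ValueError(f"unknown failure behavior: {failure_behavior!r}")

# A firewall-like intermediate node that should fail closed.
print(route("node-b", "firewall-1", intermediate_available=False, failure_behavior="block"))
```

Failing closed ("block") suits intermediate nodes that enforce policy, while failing open ("allow") suits nodes that only observe or optimize traffic.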

SYSTEM AND METHOD FOR DE-ANONYMIZING ACTIONS AND MESSAGES ON NETWORKS

A traffic-monitoring system monitors encrypted traffic exchanged between IP addresses used by devices and a network, and further receives user-action details that are passed over the network. By correlating the times at which the encrypted traffic is exchanged with the times at which the user-action details are received, the system associates the user-action details with the IP addresses. In particular, for each action specified in the user-action details, the system identifies one or more IP addresses that may be the source of the action. Based on the IP addresses, the system may identify one or more users who may have performed the action. The system may correlate the respective action-times of the encrypted actions with the respective approximate action-times of the indicated actions. The system may hypothesize that an indicated action corresponds to one of the encrypted actions having a matching action-time.

ROOT-CAUSE ANALYSIS OF EVENT OCCURRENCES

Provided herein are systems and methods for determining relationships between events occurring in networks. Notifications describing events occurring in networks can be received and processed to determine groups of network event types. A root-cause network can be generated based on the events, with the nodes of the root-cause network representing different event types and the edges of the root-cause network indicating directional, causal relationships between the nodes. A received network event can be processed to determine potential causes of the received network event based on the root-cause network and other events received by the network.
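
One way to picture the lookup step is a walk up a directed graph of event types. The sketch below is not the patented method: the graph contents, the name `potential_causes`, and the rule "report every ancestor that was also observed" are assumptions made for illustration.

```python
# Hypothetical root-cause network: each key is a cause, each value lists the
# event types it can directly produce (directed edge cause -> effect).
CAUSES = {
    "power_loss": ["switch_down"],
    "switch_down": ["link_down", "bgp_flap"],
    "link_down": ["packet_loss"],
}

def potential_causes(event_type, observed, edges=CAUSES):
    """Return ancestors of `event_type` in the graph that were also observed."""
    # Invert the edges so each effect maps to its direct causes.
    parents = {}
    for cause, effects in edges.items():
        for effect in effects:
            parents.setdefault(effect, set()).add(cause)

    found, stack, seen = [], [event_type], set()
    while stack:
        node = stack.pop()
        for cause in parents.get(node, ()):
            if cause not in seen:
                seen.add(cause)
                stack.append(cause)
                if cause in observed:
                    found.append(cause)
    return found

recent = {"link_down", "power_loss", "bgp_flap"}
print(potential_causes("packet_loss", recent))
```

Here bgp_flap is observed but is not an ancestor of packet_loss, so it is not reported as a potential cause.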

SYSTEMS AND METHODS FOR PREDICTIVE ASSURANCE
20220038330 · 2022-02-03

Systems and methods are provided for predicting system or network failures, such as a degradation in the services provided by a service provider system or network, a system or network outage, etc. In a discovery phase, failure scenarios that can be accurately predicted based on monitored system events are identified. In an operationalization phase, those failure scenarios can be used to design production run-time machines that can be used to predict, in real-time, a future failure scenario. An early warning signal(s) can be sent in advance of the failure scenario occurring.

System and method for remote maintenance of user units
09727404 · 2017-08-08

A system and method for remote maintenance of user units allows efficient diagnosis of failures in a reduced time. Each user unit transmits to a management server, via a network, state data related to hardware and software parameters associated with an operating mode of the user unit. The method includes: storing state data in a user unit memory, monitoring state data stored in the memory, and detecting at least one datum of a state indicating an operational failure of the user unit. When a failure is detected, state data corresponding to current states of the user unit at the moment of the failure and state data corresponding to states stored during a predetermined period before the failure are extracted and transmitted to the management server, which determines a statistical correlation coefficient between the values of each state of a user unit and the values of states of other user units.
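
The abstract does not say which correlation coefficient the management server computes. The sketch below assumes the Pearson coefficient and uses made-up temperature readings; the name `pearson` and the unit labels are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long value series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(var_x * var_y)

# Illustrative state values (e.g. temperature) recorded before a failure.
failed_unit = [40, 45, 52, 61, 75]
other_units = {
    "unit-a": [41, 44, 53, 60, 74],  # drifts the same way
    "unit-b": [50, 50, 49, 51, 50],  # stays flat
}
for name, values in other_units.items():
    print(name, pearson(failed_unit, values))
```

A coefficient near 1 for another unit suggests it is following the same trajectory that preceded the failure.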

SYSTEM AND METHOD FOR ANOMALY DETECTION WITH ROOT CAUSE IDENTIFICATION

A computer device may include a processor configured to obtain key performance indicator (KPI) values for KPI parameters associated with at least one device and compute a set of historical statistical values for the obtained KPI values associated with the at least one device. The processor may be further configured to provide the KPI values and the computed set of historical statistical values to an anomaly detection model to identify potential anomalies; filter the identified potential anomalies based on a designated desirable behavior for a particular KPI parameter to identify at least one anomaly; and send an alert that includes information identifying the at least one anomaly to a management system or a repair system associated with the device. The computer device may further determine a root cause KPI parameter for the identified at least one anomaly and include information identifying the determined root cause KPI parameter in the alert.
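
The pipeline of statistics, model, filter, and root-cause selection can be sketched as follows. A plain z-score stands in for the anomaly detection model, and the root-cause KPI is assumed to be the one with the largest deviation; neither choice comes from the abstract. The function name `detect_anomalies`, the KPI names, and the threshold are hypothetical.

```python
from statistics import mean, stdev

def detect_anomalies(history, current, desirable, threshold=3.0):
    """history: KPI -> past values; current: KPI -> latest value;
    desirable: KPI -> "high" or "low" (the direction that is good)."""
    anomalies = []
    for kpi, value in current.items():
        mu, sigma = mean(history[kpi]), stdev(history[kpi])
        if sigma == 0:
            continue  # no spread in the history, z-score undefined
        z = (value - mu) / sigma
        if abs(z) < threshold:
            continue  # within normal range
        # Filter step: a large deviation in the desirable direction is not reported.
        if (desirable[kpi] == "high" and z > 0) or (desirable[kpi] == "low" and z < 0):
            continue
        anomalies.append((kpi, z))
    # Assumed root-cause rule: the KPI with the largest absolute deviation.
    root_cause = max(anomalies, key=lambda item: abs(item[1]))[0] if anomalies else None
    return anomalies, root_cause

history = {
    "throughput_mbps": [100, 102, 98, 101, 99],
    "latency_ms": [10, 11, 9, 10, 10],
}
current = {"throughput_mbps": 150, "latency_ms": 30}
desirable = {"throughput_mbps": "high", "latency_ms": "low"}

anomalies, root_cause = detect_anomalies(history, current, desirable)
print([kpi for kpi, _ in anomalies], root_cause)
```

The throughput spike is statistically unusual but desirable, so only the latency jump survives the filter and is reported in the alert.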

FAULT DETECTION OF SERVICE CHAINS IN A SDN/NFV NETWORK ENVIRONMENT

An electronic device includes a processor and a memory coupled to the processor and storing computer readable program code that when executed by the processor causes the processor to perform operations including generating, at given time intervals, a plurality of topology graphs that correspond to a service chain that comprises a plurality of virtual network functions (VNFs) and that is operating in a software defined network (SDN)/network function virtualization (NFV) computing environment, each of the plurality of topology graphs corresponding to a different one of the time intervals. Operations may include comparing a first one of the plurality of topology graphs that is received at a first time to a second one of the plurality of topology graphs that is received at a second time that is after the first time to determine if the service chain has a fault.
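
A minimal sketch of the comparison step, assuming each topology graph is a set of directed edges between VNFs and that a fault is signalled whenever an edge present in the earlier graph is missing from the later one. The abstract does not define the fault criterion, and the name `diff_topology` and the VNF names are hypothetical.

```python
def diff_topology(earlier, later):
    """Compare two snapshots of a service chain, each a set of (src, dst) edges."""
    missing = earlier - later
    added = later - earlier
    return {
        "missing_edges": missing,
        "added_edges": added,
        "fault": bool(missing),  # assumed rule: any lost edge indicates a fault
    }

# Snapshots of a three-VNF chain taken at two time intervals.
graph_t1 = {("firewall", "nat"), ("nat", "load_balancer")}
graph_t2 = {("firewall", "nat")}

print(diff_topology(graph_t1, graph_t2))
```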

Method and apparatus for analysis of the operation of a communication system using events

The present invention relates to a method and apparatus for event analysis in a communication (telecommunication or computer) system. In particular, the invention relates to a method and apparatus for analyzing events representing activity in the communication system. Embodiments provide a progressive technique for the analysis of the operation of a communication system. Embodiments provide a bottom-up approach by first detecting bursts of events, and then establishing causal relationships between events and system operation reports using detected event burst records representing the occurrence of burst behaviors in events in a system. Based on the causal relationships found, causes of a change in system operation may be identified by determining parameters associated with events of an event burst relevant to the change in system operation.
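
The first stage, burst detection, might look like the sketch below. The burst definition used here is an assumption: a run of events whose consecutive gaps are at most `max_gap` seconds and that contains at least `min_events` events. The function name and both parameters are hypothetical.

```python
def detect_bursts(timestamps, max_gap=2.0, min_events=4):
    """Return one (start, end, count) burst record per qualifying run of events."""
    bursts, run = [], []
    for t in sorted(timestamps):
        # A gap larger than max_gap closes the current run.
        if run and t - run[-1] > max_gap:
            if len(run) >= min_events:
                bursts.append((run[0], run[-1], len(run)))
            run = []
        run.append(t)
    if len(run) >= min_events:
        bursts.append((run[0], run[-1], len(run)))
    return bursts

events = [1, 1.5, 2, 3, 20, 40, 41, 42, 42.5, 43, 90]
print(detect_bursts(events))
```

Each record could then be matched by time against system operation reports to look for the causal relationships the abstract describes.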

Method and system for real-time, false positive resistant, load independent and self-learning anomaly detection of measured transaction execution parameters like response times

A combined transaction execution monitoring, transaction classification and transaction execution performance anomaly detection system is disclosed. The system receives and analyzes transaction tracing data which may be provided by monitoring agents deployed to transaction executing entities like processes. In a first classification stage, parameters are extracted from received transaction tracing data, and the transaction tracing data is tagged with the extracted classification data. A subsequent measure extraction stage analyzes the classified transaction tracing data and creates corresponding measurements which are tagged with the transaction classifier. A following statistical analysis process maintains statistical data describing the long term statistical behavior of classified measures as a baseline, and also calculates corresponding statistical data describing the current statistical behavior of the classified measures. The statistical analysis process detects and notifies significant deviations between the statistical distribution of baseline and current measure data. A subsequent anomaly alerting and visualization stage processes those notifications.
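
The statistical analysis stage compares a long-term baseline with current behavior. The sketch below substitutes a much simpler test than a comparison of full statistical distributions: it flags a deviation when the current mean differs from the baseline mean by more than `k` standard errors. The name `significant_deviation`, the value of `k`, and the response-time samples are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def significant_deviation(baseline, current, k=3.0):
    """True when the mean of `current` is more than k standard errors
    away from the mean of `baseline` (simplified stand-in test)."""
    standard_error = stdev(baseline) / sqrt(len(current))
    difference = abs(mean(current) - mean(baseline))
    if standard_error == 0:
        return difference > 0
    return difference / standard_error > k

# Response times in milliseconds for one transaction class.
baseline_ms = [100, 110, 90, 105, 95, 100, 102, 98]
print(significant_deviation(baseline_ms, [150, 160, 155, 148]))
print(significant_deviation(baseline_ms, [101, 99, 100, 103]))
```

Running the check separately for each transaction class keeps a slowdown in one class from being masked by normal behavior in the others.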