Patent classifications
G06F11/0721
RECONSTRUCTING EXECUTION CALL FLOWS TO DETECT ANOMALIES
Systems and methods of reconstructing execution call flows to detect anomalies are provided. A device can establish call flows using information extracted from a log file. Each of the call flows can identify information from the log file of a call flowing through a plurality of modules. The device can identify a count of a number of occurrences of one or more keywords in information of each call flow. The device can generate a vector of numbers for each call flow based at least on the count for the one or more keywords for that call flow. The device can classify each call flow into one or more clusters that indicate whether an operation of the call flow is anomalous. The device can classify each call flow using the vector of numbers for each call flow.
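The pipeline described in this abstract (group log lines into call flows, count keywords, build a vector per flow, cluster the vectors) can be illustrated with a minimal sketch. The log line layout (`call_id module message`), the keyword list, and the tiny two-cluster k-means are all assumptions made for illustration; the abstract does not specify a log format or a clustering algorithm.

```python
from collections import defaultdict

KEYWORDS = ("error", "timeout", "retry")  # hypothetical keyword list


def build_call_flows(log_lines):
    """Group log messages by call ID; assumes 'call_id module message' lines."""
    flows = defaultdict(list)
    for line in log_lines:
        call_id, _module, message = line.split(" ", 2)
        flows[call_id].append(message.lower())
    return flows


def keyword_vector(messages):
    """Count occurrences of each keyword across one call flow's messages."""
    return [sum(m.count(k) for m in messages) for k in KEYWORDS]


def two_means(vectors, iterations=10):
    """Tiny 2-cluster k-means, seeded with the lowest- and highest-count vectors."""
    centroids = [min(vectors, key=sum), max(vectors, key=sum)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to the nearest centroid (squared Euclidean distance).
        labels = [
            min((0, 1), key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        # Recompute each centroid as the mean of its members.
        for c in (0, 1):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


log = [
    "c1 auth request accepted",
    "c1 db query ok",
    "c2 auth request accepted",
    "c2 db timeout, retry scheduled",
    "c2 db error after retry",
    "c3 auth request accepted",
]
flows = build_call_flows(log)
ids = sorted(flows)
vectors = [keyword_vector(flows[i]) for i in ids]
labels = two_means(vectors)
```

In this sketch, cluster 1 is the one seeded from the flow with the most keyword hits, so treating it as the anomalous cluster is a convention of the example rather than something the abstract states.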
INFORMATION PROCESSING SYSTEM, METHOD, AND APPARATUS
An information processing system, method, and apparatus reduce maintenance costs and management work and expedite countermeasures. A guide for a new event is selected based on information transmitted from the monitoring target node at which the new event has occurred; it is judged whether a countermeasure designated by the guide selected for the new event can be executed; under this circumstance, past events having similarity to the new event which has occurred at the monitoring target node are identified; and if countermeasures against a specified last number of the past events among the identified past events have been successful and a countermeasure against the past event which is the latest and is more similar to the new event among the past events identified as the new event has been successful, it is judged that the countermeasure designated by the guide selected by the guide selection unit should be executed.
SYSTEMS AND METHODS FOR MARGIN BASED DIAGNOSTIC TOOLS FOR PRIORITY PREEMPTIVE SCHEDULERS
In one embodiment, a method for margin determination for a computing system with a real time operating system and priority preemptive scheduling comprises: scheduling a set of tasks to be executed in one or more partitions, wherein each is assigned a priority, wherein the tasks comprise periodic and/or aperiodic tasks; executing the set of tasks on the computing system within the scheduled periodic time window; introducing an overhead task executed for an execution duration controlled either by the real time operating system or by the overhead task; controlling the overhead task to converge on a point of failure at which a length of the execution duration of the overhead task causes either: 1) a periodic task to fail to execute within a deadline, or 2) time available for the aperiodic tasks to execute to fall below a threshold; and defining a partition margin corresponding to the point of failure.
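The idea of growing an overhead task until a deadline is missed can be sketched as a search over the overhead duration. The sketch below is a toy model under stated assumptions: the overhead runs first, the periodic tasks then run back to back in priority order, each task is a hypothetical `(execution_us, deadline_us)` pair, and binary search stands in for whatever convergence procedure the method actually uses. The aperiodic-time threshold in the abstract is not modeled.

```python
def meets_deadlines(tasks, overhead_us):
    """Toy model: overhead runs first, then tasks run back to back in priority order."""
    clock = overhead_us
    for exec_us, deadline_us in tasks:
        clock += exec_us
        if clock > deadline_us:
            return False
    return True


def partition_margin(tasks, window_us):
    """Binary-search the largest overhead (microseconds) that still meets every deadline."""
    if not meets_deadlines(tasks, 0):
        return None  # already failing with no overhead at all
    lo, hi = 0, window_us
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if meets_deadlines(tasks, mid):
            lo = mid
        else:
            hi = mid - 1
    return lo


# Hypothetical task set: (execution time, deadline), highest priority first.
tasks = [(200, 500), (300, 1000), (100, 1000)]
margin = partition_margin(tasks, window_us=1000)
```

Here the first task is the binding constraint: it finishes at 200 µs plus the overhead and must finish by 500 µs, so the search settles on a margin of 300 µs.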
HIERARCHICAL NEURAL NETWORK-BASED ROOT CAUSE ANALYSIS FOR DISTRIBUTED COMPUTING SYSTEMS
Methods and systems for detecting and responding to an anomaly include determining a first system-level performance prediction using system-level statistics. A second system-level performance prediction is determined using system-level statistics and service-level statistics. The first prediction and the second prediction are compared to identify a discrepancy. It is determined that a service corresponding to the service-level statistics is a cause of a detected failure in a distributed computing system. An action directed to the service is performed responsive to the detected failure.
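The comparison step can be sketched as follows. The two predictor functions are simple arithmetic stand-ins for the neural network models named in the title, and the statistic names (`cpu_pct`, `queue_depth`), coefficients, and threshold are hypothetical values chosen only to make the example runnable.

```python
def predict_system_only(system_stats):
    """Stand-in for the first model: predicts latency (ms) from system-level load alone."""
    return 100.0 + 2.0 * system_stats["cpu_pct"]


def predict_with_service(system_stats, service_stats):
    """Stand-in for the second model: also sees one service's statistics."""
    return predict_system_only(system_stats) + 0.5 * service_stats["queue_depth"]


def suspect_services(system_stats, services, threshold=20.0):
    """Return services whose statistics shift the prediction by more than the threshold."""
    baseline = predict_system_only(system_stats)
    suspects = {}
    for name, stats in services.items():
        gap = abs(predict_with_service(system_stats, stats) - baseline)
        if gap > threshold:
            suspects[name] = gap
    return suspects


system_stats = {"cpu_pct": 40.0}
services = {
    "checkout": {"queue_depth": 90.0},
    "search": {"queue_depth": 4.0},
}
suspects = suspect_services(system_stats, services)
```

A service whose statistics change the system-level prediction substantially is flagged as a likely cause; in this example only `checkout` exceeds the threshold.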
Fault Tree Generation Device and Fault Tree Generation Method
This failure tree generation device includes: a causal relationship storage unit that stores information indicating a linkage of the causal relationship of defects of respective component parts constituting subjects to be analyzed in a manner such that the information is associated with the connection relationship of the respective component parts; a system-level failure tree generation unit that generates, for each of the component parts and on the basis of component part constitution information indicating the constitution of component parts to be analyzed and component part connection information indicating the connection relationship of the respective component parts, first element information which is information indicating disturbance having occurred in information transfer between the respective component parts and which indicates the relationship between each of the component parts and a phenomenon having occurred on the component part, and generates, on the basis of the respective items of first element information, system level failure tree information indicating the causal relationship of defects of the component parts; and an equipment/component part level failure tree generation unit that searches the causal relationship storage unit on the basis of the component part constitution information and the component part connection information, that generates two or more items of second element information indicating information relating to a plurality of events connecting to any one first element and indicating the relationship between the respective component parts and phenomena having occurred on the respective component parts, and that generates, on the basis of the respective items of the second element information, equipment/component part level failure tree information indicating the causal relationship of defects of the component parts, by way of a hierarchical structure.
Error detection using vector processing circuitry
A data processing apparatus (2) has scalar processing circuitry (32-42) and vector processing circuitry (38, 40, 42). When executing main scalar processing on the scalar processing circuitry (32-42), or main vector processing using a subset of said plurality of lanes on the vector processing circuitry (38, 40, 42), checker processing is executed using at least one lane of the plurality of lanes on the vector processing circuitry (38, 40, 42), the checker processing comprising operations corresponding to at least part of the main scalar/vector processing. Errors can then be detected based on a comparison of an outcome of the main processing and an outcome of the checker processing. This provides a technique for achieving functional safety in a high end processor with better performance and reduced hardware cost compared to a dual/triple core lockstep approach.
Autonomous release management in distributed computing systems
Implementations described herein relate to methods, systems, and computer-readable media to provide an alert based on a release of a software application implemented in a distributed computing system. In some implementations, the method includes receiving, at a processor, an indication of the release of the software application, obtaining a first set of metric values for each metric of a list of metrics for a first time period preceding a time of release of the release, obtaining a second set of metric values for each metric of the list of metrics for a second time period following the time of release, comparing the first set of metric values to the second set of metric values to determine a deviation score, generating an alert based on the deviation score, and transmitting the alert via one of a user interface and a communication channel.
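The before/after comparison can be sketched with a simple deviation score. The abstract does not define how the score is computed; this example assumes one plausible choice, the shift in the mean expressed in units of the pre-release standard deviation. The metric names and the threshold are hypothetical.

```python
from statistics import mean, pstdev


def deviation_score(before, after):
    """Shift of the post-release mean, in units of pre-release standard deviation."""
    spread = pstdev(before) or 1.0  # avoid dividing by zero on a flat baseline
    return abs(mean(after) - mean(before)) / spread


def release_alerts(metrics_before, metrics_after, threshold=3.0):
    """Return (metric, score) pairs whose deviation score exceeds the threshold."""
    alerts = []
    for metric, before in metrics_before.items():
        score = deviation_score(before, metrics_after[metric])
        if score > threshold:
            alerts.append((metric, round(score, 2)))
    return alerts


# Hypothetical metric samples for the periods before and after a release.
before = {
    "error_rate": [1.0, 2.0, 1.0, 2.0],
    "latency_ms": [100.0, 102.0, 98.0, 100.0],
}
after = {
    "error_rate": [6.0, 7.0, 6.0, 7.0],
    "latency_ms": [101.0, 99.0, 100.0, 102.0],
}
alerts = release_alerts(before, after)
```

With these samples, the error rate moves ten baseline standard deviations and triggers an alert, while the latency shift stays well under the threshold.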
DEVICES AND COMPONENTS FOR MACHINE LEARNING-BASED SIGNAL ERROR CORRECTION AND METHODS OF USE THEREOF
The present disclosure enables signal error correction using a first processor and a memory on a first substrate, where the first processor is operationally connected to a second processor on a second substrate and the memory stores computer code having a machine learning model. The first processor executes the computer code to automatically receive, from the second processor, a first output signal intended to be received by a target recipient device. The first processor automatically inputs the first output signal into the machine learning model, where the machine learning model determines that the first output signal includes an error signal that would cause a malfunction in the target recipient device, and outputs an instruction to cause the first processor to generate a second output signal that corrects the error signal. The first processor automatically generates the second output signal and transmits the second output signal to the target recipient device.
Methods and articles of manufacture for hosting a safety critical application on an uncontrolled data processing device
Methods and articles of manufacture for hosting a safety critical application (SCA) on an uncontrolled data processing device (UDPD) are provided. Various combinations of installation, functional, host integrity, coexistence, interoperability, power management, and environment checks are performed at various times to determine if the safety critical application operates properly on the device. The operation of the SCA on the UDPD may be controlled accordingly.
SYSTEM AND METHOD FOR IMPROVED CONTROL FLOW MONITORING OF PROCESSORS
A mechanism is provided to monitor control flow failure of processors having a simple processing pipeline (e.g., RISC5) or accelerators (e.g., digital signal processors). Embodiments have a monitoring entity attached to the processor that does not interfere with the normal functionality of the accelerator. By virtue of being closely associated with the processor, the failure detection period can be smaller than that of a typical host watchdog and can be defined as per the needs of the application. In some embodiments, the failure detection period is defined by the number of clock cycles needed for the largest basic block in the executed code.