G06F11/2257

Integrated remediation system for network-based services

This disclosure describes automatically collecting, analyzing, and remediating operational issues with respect to systems executing within a network. For example, a service provider network may include a monitoring service that may generate notifications upon detection of operational issues within a system executing within the service provider network. The monitoring service may provide one or more notifications to an aggregation service that may aggregate the one or more notifications into a standardized format. Contextual information related to the operational issues may be automatically gathered by an analytics service, which may analyze the contextual information to determine a potential cause of the operational issues. Based on the potential cause, a remediation service may automatically remediate the operational issues.
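
The monitor → aggregate → analyze → remediate pipeline described above can be sketched as follows; all class names, fields, rules, and thresholds here are illustrative assumptions, not the patented implementation.

```python
from dataclasses import dataclass

@dataclass
class Notification:
    source: str
    payload: dict

def aggregate(notifications):
    """Aggregation service: normalize heterogeneous notifications
    into a single standardized format."""
    return [{"service": n.source,
             "issue": n.payload.get("error", "unknown"),
             "severity": n.payload.get("sev", "low")}
            for n in notifications]

def analyze(record, context):
    """Analytics service: combine the standardized record with
    automatically gathered context to guess a potential cause."""
    if context.get("cpu_pct", 0) > 90:
        return "cpu_saturation"
    if context.get("disk_free_mb", 1_000) < 100:
        return "disk_full"
    return "unknown"

# Hypothetical cause-to-action table used by the remediation service.
REMEDIATIONS = {"cpu_saturation": "scale_out",
                "disk_full": "purge_logs"}

def remediate(cause):
    """Remediation service: map the potential cause to an action,
    falling back to a human-handled ticket."""
    return REMEDIATIONS.get(cause, "open_ticket")
```

The fallback to `open_ticket` reflects that automatic remediation only applies when a potential cause is actually determined.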

Systems, methods, and apparatuses for detecting and creating operation incidents

Techniques for determining insight are described. An exemplary method includes receiving a request to provide insight into potential abnormal behavior; receiving one or more of anomaly information and event information associated with the potential abnormal behavior; evaluating the received anomaly information and/or event information to determine whether there is insight as to what is causing the potential abnormal behavior and to add to an insight at least two of: an indication of a metric involved in the abnormal behavior, a severity for the insight, an indication of a relevant event involved in the abnormal behavior, and a recommendation on how to cure the potential abnormal behavior; and providing an insight indication for the generated insight.
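
A minimal sketch of assembling such an "insight" from anomaly and event information follows; the field names, the deviation threshold, and the deployment heuristic are assumptions for illustration only.

```python
def build_insight(anomalies, events):
    """Return an insight containing at least two of: an involved metric,
    a severity, a relevant event, and a cure recommendation."""
    insight = {}
    if anomalies:
        # Attribute the insight to the most deviant metric.
        worst = max(anomalies, key=lambda a: a["deviation"])
        insight["metric"] = worst["metric"]
        insight["severity"] = "high" if worst["deviation"] > 3.0 else "medium"
    for ev in events:
        # Treat deployments near the anomaly as relevant events.
        if ev["type"] == "deployment":
            insight["relevant_event"] = ev["id"]
            insight["recommendation"] = "roll back deployment " + ev["id"]
    # Only emit an insight indication if at least two fields were filled.
    return insight if len(insight) >= 2 else None
```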

PREDICTIVE BATCH JOB FAILURE DETECTION AND REMEDIATION

Systems, methods, and computer programming products for predicting, preventing, and remediating failures of batch jobs being executed and/or queued for processing at a future scheduled time. Batch job parameters, messages, and system logs are stored in knowledge bases and/or input into AI models for analysis. Using predictive analytics and/or machine learning, batch job failures are predicted before the failures occur. Mappings of the processes used by each batch job, historical data from previous batch jobs, and data identifying the success or failure thereof build an archive that can be refined over time through active learning feedback and AI modeling to predictively recommend actions that have historically prevented or remediated failures. Recommended actions are reported to the system administrator or automatically applied. As job failures occur over time, mapping the current system log to the logs of unsuccessful batch jobs makes root cause analysis simpler and more automated.
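
A toy version of this predict-then-recommend loop might look like the following; the history record format, the failure-rate predictor, and the action names are all illustrative assumptions standing in for the AI models described above.

```python
from collections import defaultdict

def failure_rate(history, job_name):
    """Predicted failure likelihood: historical failure rate of this job."""
    runs = [h for h in history if h["job"] == job_name]
    if not runs:
        return 0.0
    return sum(1 for h in runs if not h["ok"]) / len(runs)

def recommend(history, job_name, threshold=0.5):
    """If the predicted failure rate exceeds the threshold, return the
    action most often associated with later successful runs; otherwise
    no intervention is recommended."""
    if failure_rate(history, job_name) <= threshold:
        return None
    counts = defaultdict(int)
    for h in history:
        if h["job"] == job_name and h["ok"] and h.get("action"):
            counts[h["action"]] += 1
    return max(counts, key=counts.get) if counts else "alert_admin"
```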

Signal analysis method and test system

A signal analysis method of analyzing a performance of a device under test is described. A digitized input signal is obtained, wherein the digitized input signal is associated with the device under test. At least one characteristic quantity is determined via an artificial intelligence circuit. The artificial intelligence circuit includes at least one computing parameter. The at least one characteristic quantity is determined based on the digitized input signal and based on the at least one computing parameter. The at least one characteristic quantity is indicative of at least one performance property of the device under test. Further, a test system for analyzing a performance of a device under test as well as a computer program or program product are described.
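
As a toy stand-in for the "artificial intelligence circuit," the sketch below maps a digitized input signal to a characteristic quantity through a single learnable computing parameter; the scaled-RMS model and least-squares calibration are assumptions for illustration, not the described circuit.

```python
import math

def characteristic_quantity(samples, w):
    """Characteristic quantity: RMS amplitude of the digitized input
    signal scaled by the computing parameter w."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return w * rms

def fit_parameter(calibration):
    """Fit w from (signal, known_quantity) calibration pairs by
    least squares on the scaled-RMS model."""
    num = den = 0.0
    for samples, target in calibration:
        rms = math.sqrt(sum(s * s for s in samples) / len(samples))
        num += rms * target
        den += rms * rms
    return num / den
```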

Method and system for verifying state monitor reliability in hyper-converged infrastructure appliances

A method and system for verifying state monitor reliability in hyper-converged infrastructure (HCI) appliances. Specifically, the method and system disclosed herein entail using a supervised machine learning model—i.e., a classification decision tree—to accurately distinguish whether conflicting event notifications, logged across multiple state monitors tracking state on an HCI appliance, are directed to a real event or a non-real event. The classification decision tree, generated based at least on information gains calculated for the multiple state monitors, may reflect which state monitor(s) is/are more reliable in accurately classifying the conflicting event notifications.
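
The information-gain ranking at the heart of that decision tree can be sketched as follows; the event records and monitor names are illustrative, with each record holding one boolean notification per monitor plus a ground-truth `real` label.

```python
import math

def entropy(labels):
    """Shannon entropy of a list of labels."""
    n = len(labels)
    out = 0.0
    for v in set(labels):
        p = labels.count(v) / n
        out -= p * math.log2(p)
    return out

def information_gain(records, monitor, label_key="real"):
    """Gain from splitting event records on one monitor's notification."""
    labels = [r[label_key] for r in records]
    gain = entropy(labels)
    for value in (True, False):
        subset = [r[label_key] for r in records if r[monitor] == value]
        if subset:
            gain -= len(subset) / len(records) * entropy(subset)
    return gain

def most_reliable(records, monitors):
    """The monitor with the highest information gain best separates
    real events from non-real events."""
    return max(monitors, key=lambda m: information_gain(records, m))
```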

Abnormality detection device, abnormality detection method, and storage medium

An abnormality detection device according to an embodiment includes a detector, a remover, and a learner. The detector detects first abnormal data in detection target data, which is an abnormality detection target, by inputting the detection target data to a first autoencoder that has performed learning based on first learning target data, which is a learning target. The remover removes data associated with the first abnormal data from the first learning target data to generate second learning target data, by inputting the first learning target data to a second autoencoder that has performed learning based on the first abnormal data detected by the detector. The learner causes the first autoencoder to perform learning based on the second learning target data generated by the remover.
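
The control flow of this detect → remove → relearn scheme can be shown with simple reconstruction-error stand-ins in place of real autoencoders; the error functions and thresholds below are purely illustrative.

```python
def detect_abnormal(data, recon_error, threshold):
    """First autoencoder (detector): items it reconstructs POORLY,
    i.e. with high reconstruction error, are abnormal."""
    return [x for x in data if recon_error(x) > threshold]

def remove_like(data, abnormal_error, threshold):
    """Second autoencoder (remover), trained on the abnormal data:
    items it reconstructs WELL resemble the abnormal data, so keep
    only items with high error under it."""
    return [x for x in data if abnormal_error(x) > threshold]

def relearn(clean_data):
    """Learner: retrain the first autoencoder; here, just record the
    new training-set size as a stand-in for an actual fit."""
    return {"trained_on": len(clean_data)}
```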

TEST CONTROL DEVICE, TEST SYSTEM, AND CONTROL METHOD
20230080117 · 2023-03-16 ·

A test control device includes a test variable generation device and a test processing device. The test variable generation device uses a test prediction model to generate a first manipulated variable based on a difference between a target value and a first controlled variable value from a device under test. The test processing device acquires a second controlled variable value from the device under test based on use of the first manipulated variable. The test variable generation device notifies the device under test of the end of the test if the second controlled variable value is equal to or greater than the target value, or uses the test prediction model to generate a second manipulated variable based on a difference between the target value and the second controlled variable value when the second controlled variable value is less than the target value.
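
The iteration described above amounts to a closed control loop: the prediction model turns the error (target minus controlled variable) into a manipulated variable until the controlled variable reaches the target. A hedged sketch, where the proportional "model" and the `plant` callable standing in for the device under test are assumptions:

```python
def run_test(target, plant, model_gain=0.5, start=0.0, max_steps=100):
    """Iterate manipulated-variable generation until the controlled
    variable from the device under test reaches the target."""
    controlled = start
    for step in range(1, max_steps + 1):
        if controlled >= target:
            # Notify end of test: target reached.
            return {"done": True, "steps": step - 1, "value": controlled}
        manipulated = model_gain * (target - controlled)  # test prediction model
        controlled = plant(controlled, manipulated)       # device under test
    return {"done": False, "steps": max_steps, "value": controlled}
```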

MEASURING DRIVING MODEL COVERAGE BY MICROSCOPE DRIVING MODEL KNOWLEDGE
20230081687 · 2023-03-16 ·

A computer-implemented method is provided for redundancy reduction for driving test scenarios. The method includes receiving an original test set of driving scenarios and a driving model which simulates a vehicle behavior under a driving scenario inputted to the driving model. The method includes, for each driving scenario of the original test set, obtaining vehicle dynamics timeseries data as an output of the driving model. The method includes determining similar driving scenarios by comparing driving model outputs. The method additionally includes creating a new test set of driving scenarios by discarding duplicated ones of the similar driving scenarios from the original test set.
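
The redundancy-reduction step can be sketched as running each scenario through the driving model, comparing the output timeseries, and keeping one representative per group of near-identical outputs; the Chebyshev distance and tolerance below are assumptions, not the claimed comparison.

```python
def timeseries_distance(a, b):
    """Maximum pointwise difference between two output timeseries."""
    return max(abs(x - y) for x, y in zip(a, b))

def reduce_test_set(scenarios, driving_model, tol=0.1):
    """Keep only scenarios whose driving-model output differs from all
    previously kept outputs by more than tol; discard duplicates."""
    kept, outputs = [], []
    for scenario in scenarios:
        trace = driving_model(scenario)  # vehicle dynamics timeseries
        if all(timeseries_distance(trace, t) > tol for t in outputs):
            kept.append(scenario)
            outputs.append(trace)
    return kept
```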

Failure analysis system for a distributed storage system
11599435 · 2023-03-07 ·

A failure analysis system identifies a root cause of a failure (or other health issue) in a virtualized computing environment and provides a recommendation for remediation. The failure analysis system uses a model-based reasoning (MBR) approach that involves building a model describing the relationships/dependencies of elements in the various layers of the virtualized computing environment, and the model is used by an inference engine to generate facts and rules for reasoning to identify an element in the virtualized computing environment that is causing the failure. Then, then the failure analysis system uses a decision tree analysis (DTA) approach to perform a deep diagnosis of the element, by traversing a decision tree that was generated by combining the rules for reasoning provided by the MBR approach, in conjunction with examining data collected by health monitors. The result of the DTA approach is then used to generate the recommendation for remediation.