G06F11/2257

Framework for UI automation based on graph recognition technology and related methods
11599449 · 2023-03-07

A GUI testing device may be configured to execute a testing state machine for interacting with a software application to generate an initial screen of a GUI. The GUI testing device may be configured to determine a current state in the testing state machine by matching a trigger target in the initial screen to a given state. The current state may include an operation, and the operation may be associated with a trigger target to operate on. The trigger may include a source state, a destination state, and a trigger target. The operation may include a user input operation and an operation trigger target. The GUI testing device may be configured to perform the operation on the matching trigger target in the initial screen to generate a next screen of the GUI, and to advance from the current state to a next state based upon the trigger.
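The state-machine loop described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: a screen is modeled as a set of visible element names (a stand-in for graph or image recognition), and the names `State`, `Trigger`, `Operation`, `find_current_state`, `run`, and `fake_app` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass(frozen=True)
class Operation:
    user_input: str  # user input operation, e.g. "click" or "type"
    target: str      # operation trigger target to act on


@dataclass(frozen=True)
class Trigger:
    source: str       # source state name
    destination: str  # destination state name
    target: str       # trigger target expected on the screen


@dataclass(frozen=True)
class State:
    name: str
    operation: Operation
    trigger: Trigger


def find_current_state(states: List[State], screen: Set[str]) -> Optional[State]:
    """Return the first state whose trigger target is visible on the screen."""
    for state in states:
        if state.trigger.target in screen:
            return state
    return None


def run(states: List[State],
        app: Callable[[Operation, Set[str]], Set[str]],
        screen: Set[str],
        max_steps: int = 10) -> List[str]:
    """Perform each state's operation and follow its trigger to the next state."""
    by_name = {s.name: s for s in states}
    visited: List[str] = []
    state = find_current_state(states, screen)
    while state is not None and len(visited) < max_steps:
        visited.append(state.name)
        screen = app(state.operation, screen)            # generates the next screen
        state = by_name.get(state.trigger.destination)   # advance via the trigger
    return visited


def fake_app(op: Operation, screen: Set[str]) -> Set[str]:
    """Hypothetical application under test: maps an operation to the next screen."""
    transitions = {
        ("click", "login_button"): {"username_field"},
        ("type", "username_field"): {"home_banner"},
    }
    return transitions.get((op.user_input, op.target), screen)


STATES = [
    State("login", Operation("click", "login_button"),
          Trigger("login", "enter_user", "login_button")),
    State("enter_user", Operation("type", "username_field"),
          Trigger("enter_user", "done", "username_field")),
]
```

Calling `run(STATES, fake_app, {"login_button"})` walks the two states in order and stops when the destination state `"done"` is not defined in the machine.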

SYSTEMS AND METHODS FOR DATA-DRIVEN PROACTIVE DETECTION AND REMEDIATION OF ERRORS ON ENDPOINT COMPUTING SYSTEMS

Systems and methods for proactive support of computing assets are presented. In contrast to existing techniques of reactive support, the proactive support techniques disclosed herein automatically collect operating data from a plurality of computing devices, analyze the operating data to identify predictive indicators associated with error conditions, identify a subset of affected computing devices that match the predictive indicators, and execute corrective scripts to remediate or avoid such error conditions before problems are experienced on the affected computing devices. The operating data may be used to train a machine learning model in order to identify the predictive indicators associated with each error condition. In some embodiments, the corrective scripts may be automatically generated to adjust operating parameters or applications of the affected computing devices based upon the identified predictive indicators.

Storage device failure policies

Example implementations relate to a failure policy. For example, in an implementation, storage device status data is encoded into storage device states. An action is chosen based on the storage device state according to a failure policy, where the failure policy prescribes, based on a probabilistic model, whether for a particular storage device state a corresponding action is to take no action or to initiate a failure mitigation procedure on a storage device. The failure policy is rewarded according to a timeliness of choosing to initiate the failure mitigation procedure relative to a failure of the storage device.
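One way to picture a policy that is rewarded for timeliness is a small tabular Q-learning update. Q-learning is a swapped-in technique chosen for illustration; the abstract does not say which learning method is used. The reward shape, the 24 to 168 hour window, the state names, and the function names below are all assumptions.

```python
from collections import defaultdict
from typing import DefaultDict, Optional, Tuple

ACTIONS = ("no_action", "mitigate")
QTable = DefaultDict[Tuple[str, str], float]


def timeliness_reward(action: str,
                      hours_to_failure: Optional[float],
                      window: Tuple[float, float] = (24.0, 168.0)) -> float:
    """Assumed reward: +1 if mitigation starts inside the window before failure,
    -1 if it is too early, too late, or the device never fails, 0 for no action."""
    lo, hi = window
    if action == "no_action":
        return 0.0
    if hours_to_failure is not None and lo <= hours_to_failure <= hi:
        return 1.0
    return -1.0


def update(q: QTable, state: str, action: str, reward: float, next_state: str,
           alpha: float = 0.5, gamma: float = 0.9) -> None:
    """Standard tabular Q-learning update for one observed transition."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def choose(q: QTable, state: str) -> str:
    """Greedy action for a storage device state (ties resolve to 'no_action')."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


q: QTable = defaultdict(float)
# Hypothetical episode: a device in state "degraded" was mitigated 48 h before failure.
update(q, "degraded", "mitigate", timeliness_reward("mitigate", 48.0), "replaced")
```

After this single update, the greedy choice for `"degraded"` is `"mitigate"`, while an unseen state such as `"healthy"` still defaults to `"no_action"`.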

Intelligent condition monitoring and fault diagnostic system for preventative maintenance

A system for condition monitoring and fault diagnosis includes a data collection function that acquires time histories of selected variables for one or more of the components, a pre-processing function that calculates specified characteristics of the time histories, an analysis function for evaluating the characteristics to produce one or more hypotheses of a condition of the one or more components, and a reasoning function for determining the condition of the one or more components from the one or more hypotheses.
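The chain of pre-processing, analysis, and reasoning functions can be sketched as three small functions. The characteristics (mean and peak), the thresholds, the hypothesis labels, and the scoring rule are hypothetical placeholders; the abstract does not specify them.

```python
from statistics import mean
from typing import Dict, List, Tuple


def preprocess(history: List[float]) -> Dict[str, float]:
    """Compute assumed characteristics of one variable's time history."""
    return {"mean": mean(history), "peak": max(history)}


def analyze(features: Dict[str, float], limit: float) -> List[Tuple[str, float]]:
    """Produce (hypothesis, score) pairs from the characteristics."""
    hypotheses: List[Tuple[str, float]] = []
    if features["peak"] > limit:
        hypotheses.append(("overload", features["peak"] / limit))
    if features["mean"] > 0.8 * limit:
        hypotheses.append(("wear", features["mean"] / limit))
    return hypotheses


def reason(hypotheses: List[Tuple[str, float]]) -> str:
    """Pick the highest-scoring hypothesis, or 'healthy' if there is none."""
    if not hypotheses:
        return "healthy"
    return max(hypotheses, key=lambda h: h[1])[0]


def diagnose(history: List[float], limit: float) -> str:
    """Run the full pipeline on one collected time history."""
    return reason(analyze(preprocess(history), limit))
```

For example, `diagnose([1.0, 1.2, 3.0], limit=2.0)` yields both an overload and a wear hypothesis, and the reasoning step selects the overload one because it scores higher.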

Detecting datacenter mass outage with near real-time/offline data using ML models

The present embodiments relate to data center outage detection and alert generation. An outage detection service as described herein can receive near real-time data from various sources in a datacenter and process the data using a model to determine one or more projected sources of a detected outage. The model as described herein can include one or more machine learning models incorporating a series of rules to process near real-time data and offline data and determine one or more projected sources of an outage. An alert message can be generated to provide the projected sources of the outage and other data relevant to the outage.

Predictive batch job failure detection and remediation

Systems, methods, and computer programming products are provided for predicting, preventing, and remediating failures of batch jobs being executed and/or queued for processing at a future scheduled time. Batch job parameters, messages, and system logs are stored in knowledge bases and/or inputted into AI models for analysis. Using predictive analytics and/or machine learning, batch job failures are predicted before the failures occur. Mappings of processes used by each batch job, historical data from previous batch jobs, and data identifying the success or failure thereof build an archive that can be refined over time through active learning feedback and AI modeling to predictively recommend actions that have historically prevented or remediated failures. Recommended actions are reported to the system administrator or automatically applied. As job failures occur over time, mappings of the current system log to logs for the unsuccessful batch jobs help root cause analysis become simpler and more automated.

AUTOMATIC ROOT CAUSE ANALYSIS USING TERNARY FAULT SCENARIO REPRESENTATION
20220365836 · 2022-11-17

A plurality of potential fault scenarios are accessed, wherein a given potential fault scenario of the plurality of potential fault scenarios has at least one corresponding root cause, and a representation of the given potential fault scenario comprises a don't care value. An actual fault scenario from telemetry received from a monitored system is generated. The actual fault scenario is matched against the plurality of potential fault scenarios. One or more matched causes are output as one or more probable root cause failures of the monitored system.
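The matching step can be illustrated with a small ternary comparison. This sketch assumes each fault scenario is a fixed-length string over `0`, `1`, and `x`, where `x` stands for the don't care value, and that the actual scenario derived from telemetry is fully known. The symbol choice, the scenario table, and the function names are hypothetical.

```python
from typing import Dict, List

DONT_CARE = "x"  # assumed symbol for the "don't care" value


def matches(actual: str, potential: str) -> bool:
    """True if every position of `potential` is don't care or equals `actual`."""
    if len(actual) != len(potential):
        return False
    return all(p == DONT_CARE or p == a for a, p in zip(actual, potential))


def probable_root_causes(actual: str, potential: Dict[str, str]) -> List[str]:
    """Return the root causes whose potential fault scenario matches `actual`."""
    return [cause for cause, pattern in potential.items() if matches(actual, pattern)]


# Hypothetical table: root cause -> ternary potential fault scenario.
POTENTIAL = {
    "power_supply_failure": "1x0",
    "link_down": "x11",
    "fan_failure": "01x",
}
```

With this table, the actual scenario `"110"` matches only `power_supply_failure`, while `"011"` matches both `link_down` and `fan_failure`, so more than one probable root cause can be output.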

Automated evaluation of test logs

An automated evaluation of test logs for the testing of telecommunications equipment includes a probabilistic model that links possible events in a test log with possible causes for the event. Probability values for possible causes are calculated from the probabilistic model and a search result, and a reference to a possible cause is provided in an output based upon the calculated probability values.
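A simple way to link events to causes is Bayes' rule with independent event likelihoods. The sketch below assumes that form of probabilistic model, which the abstract does not confirm; the priors, likelihoods, event names, and the `rank_causes` function are illustrative only.

```python
from typing import Dict, List, Tuple


def rank_causes(priors: Dict[str, float],
                likelihoods: Dict[Tuple[str, str], float],
                found_events: List[str],
                default: float = 1e-6) -> List[Tuple[str, float]]:
    """Rank causes by posterior probability given the events found in a test log.

    priors[cause] is P(cause); likelihoods[(event, cause)] is P(event | cause).
    Missing likelihoods fall back to a small assumed default.
    """
    scores: Dict[str, float] = {}
    for cause, prior in priors.items():
        score = prior
        for event in found_events:
            score *= likelihoods.get((event, cause), default)
        scores[cause] = score
    total = sum(scores.values())
    ranked = [(cause, score / total) for cause, score in scores.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical model for a telecom test log.
PRIORS = {"cable_fault": 0.5, "config_error": 0.5}
LIKELIHOODS = {
    ("timeout", "cable_fault"): 0.8,
    ("timeout", "config_error"): 0.2,
}
```

Here the search result is represented by `found_events`, the list of events located in the log; the top-ranked entry is the cause that would be referenced in the output.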

Method and device for testing a technical system

A method for testing a technical system. The method includes carrying out tests with the aid of a simulation of the system; evaluating the tests with respect to a fulfillment measure of a quantitative requirement on the system and an error measure of the simulation; and, on the basis of the fulfillment measure and the error measure, classifying the tests as either reliable or unreliable.
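The abstract does not state the classification rule. One plausible rule, assumed here purely for illustration, treats a test as reliable when the fulfillment measure is farther from zero than the simulation's error measure, so the simulation error could not flip the pass or fail verdict.

```python
def classify_test(fulfillment: float, error: float) -> str:
    """Assumed rule: reliable if |fulfillment measure| exceeds the error measure.

    fulfillment > 0 means the requirement is met in simulation, < 0 means violated;
    error is a non-negative bound on the simulation's deviation from the real system.
    """
    return "reliable" if abs(fulfillment) > error else "unreliable"
```

Under this assumption, a test with fulfillment 0.8 and error 0.1 is classified as reliable, whereas a test with fulfillment -0.05 and the same error is unreliable because the verdict lies within the simulation's error band.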

Coordinating fault recovery in a distributed system

In various embodiments, methods and systems for coordinating, between a host and a tenant, fault recovery of tenant infrastructure in a distributed system are provided. A fault occurrence is determined for a tenant infrastructure in the distributed system. The fault occurrence may be a software failure or hardware failure of the tenant infrastructure supporting a service application of the tenant. A fault recovery plan is communicated to the tenant to notify the tenant of the fault occurrence and of the actions to be taken to restore the tenant infrastructure. It is determined whether a fault recovery plan response is received from the tenant; the fault recovery plan response is an acknowledgement from the tenant of the fault recovery plan. Upon receiving the fault recovery plan response or at the expiration of a predefined time limit, the fault recovery plan is executed to restore the tenant infrastructure.
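The acknowledge-or-timeout behavior can be sketched with a `threading.Event` that the tenant sets when it acknowledges the plan. The callback names, the string plan, and the return values are hypothetical; a real system would use its own messaging between host and tenant.

```python
import threading
from typing import Callable


def coordinate_recovery(notify_tenant: Callable[[str], None],
                        execute_plan: Callable[[str], None],
                        plan: str,
                        ack: threading.Event,
                        timeout_s: float) -> str:
    """Notify the tenant, wait for an acknowledgement or the time limit, then execute."""
    notify_tenant(plan)
    acknowledged = ack.wait(timeout=timeout_s)  # True if the tenant responded in time
    execute_plan(plan)  # runs in both cases, as the abstract describes
    return "acknowledged" if acknowledged else "timed_out"
```

The plan is executed whether or not the tenant responds; the acknowledgement only determines how long the host waits before proceeding.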