G06F11/0766

Adaptive telemetry sampling

A data processing system implements adaptive telemetry sampling by obtaining first telemetry data from a plurality of telemetry data sources, analyzing the first telemetry data to identify a subset of telemetry data sources for which a reduced sampling rate may be implemented, determining a reduced sampling rate for each event type of the plurality of event types, selecting a subset of the event types for which the reduced sampling rate is to be applied, obtaining second telemetry data from the subset of telemetry data sources at the reduced sampling rate associated with each event type of the subset of event types, analyzing the second telemetry data to determine one or more estimated metric values for one or more metrics, and generating a report comprising the one or more estimated metric values and an estimated total cost saving based on an estimated cost saving associated with each event type.
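
The abstract does not say how metric values or cost savings are estimated from sampled data. A minimal sketch, assuming uniform random sampling so that counts can be scaled by the inverse of the sampling rate, and assuming a flat per-event cost (the names `EventTypeSample`, `cost_per_event`, and `build_report` are hypothetical):

```python
from dataclasses import dataclass


@dataclass
class EventTypeSample:
    name: str
    sampled_count: int     # events actually collected at the reduced rate
    sampling_rate: float   # fraction of events kept, 0 < rate <= 1
    cost_per_event: float  # assumed flat cost per collected event


def estimate_total(sample: EventTypeSample) -> float:
    """Scale the sampled count back up to an estimated full count."""
    return sample.sampled_count / sample.sampling_rate


def estimate_saving(sample: EventTypeSample) -> float:
    """Estimated cost avoided by not collecting the unsampled events."""
    skipped = estimate_total(sample) - sample.sampled_count
    return skipped * sample.cost_per_event


def build_report(samples: list[EventTypeSample]) -> dict:
    """Combine per-event-type estimates into one report."""
    return {
        "estimated_counts": {s.name: estimate_total(s) for s in samples},
        "estimated_total_saving": sum(estimate_saving(s) for s in samples),
    }
```

An event type left at a sampling rate of 1.0 contributes its exact count and no saving, so the total saving comes only from the event types selected for reduced sampling.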

Virtual machine fault tolerance

A system and method for providing fault tolerance in virtualized computer systems use a first guest and a second guest running on virtualization software to produce outputs when a workload is executed on the first and second guests. An output of the second guest is compared with an output of the first guest to determine if there is an output match. If there is no output match, the first guest is paused and a resynchronization of the second guest is executed to restore a checkpointed state of the first guest on the second guest. After the resynchronization of the second guest, the paused first guest is caused to resume operation.
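
The compare, pause, resynchronize, and resume sequence can be sketched as a toy simulation. This is an illustration only: the `Guest` class and `run_step` function are hypothetical, guest state is reduced to a single counter, and the checkpoint is assumed to be taken from the first guest while it is paused.

```python
import copy


class Guest:
    """Toy stand-in for a guest; real guest state is far richer."""

    def __init__(self, name):
        self.name = name
        self.state = {"counter": 0}
        self.paused = False

    def execute(self, workload):
        # The "output" is simply the updated counter.
        self.state["counter"] += workload
        return self.state["counter"]


def run_step(first, second, workload):
    """Run one workload on both guests; resynchronize on output mismatch.

    Returns True if the outputs matched, False if a resync was needed.
    """
    out_first = first.execute(workload)
    out_second = second.execute(workload)
    if out_first == out_second:
        return True

    first.paused = True                        # pause the first guest
    checkpoint = copy.deepcopy(first.state)    # assumed checkpoint source
    second.state = checkpoint                  # restore it on the second guest
    first.paused = False                       # resume the first guest
    return False
```

After a resynchronization the two guests hold equal state again, so the next workload produces matching outputs.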

Multi-level caching to deploy local volatile memory, local persistent memory, and remote persistent memory
11593186 · 2023-02-28

A technique is introduced for applying multi-level caching to deploy various types of physical memory to service captured memory calls from an application. The various types of physical memory can include local volatile memory (e.g., dynamic random-access memory), local persistent memory, and/or remote persistent memory. In an example embodiment, a user-space page fault notification mechanism is used to defer assignment of actual physical memory resources until a memory buffer is accessed by the application. After populating a selected physical memory in response to an initial user-space page fault notification, page access information can be monitored to determine which pages continue to be accessed and which pages are inactive, in order to identify candidates for eviction.

Management and remediation of database issues

Systems and methods are described that identify a database metric value associated with a database instance storing a dataset associated with a user system. A database issue is detected in view of a determination that the database metric value satisfies a condition. In response to satisfaction of the condition, a set of user action metrics associated with the user system is collected from one or more data monitoring systems. At least one notification communication is generated including at least a portion of the set of user action metrics and information identifying the database issue. The at least one notification communication is transmitted to a remediation execution system configured to execute, using the at least a portion of the set of user action metrics and information identifying the database issue, a remedial action in response to the database issue.

Method for encoded diagnostics in a functional safety system

A method includes storing a set of valid codewords including: a first valid functional codeword representing a functional state of a controller subsystem; a first valid fault codeword representing a fault state of the controller subsystem and characterized by a minimum Hamming distance from the first valid functional codeword; a second valid functional codeword representing a functional state of a controller; and a second valid fault codeword representing a fault state of the controller; in response to detecting functional operation of the controller subsystem, storing the first valid functional codeword in a first memory; in response to detecting a match between contents of the first memory and the first valid functional codeword, outputting the second valid functional codeword; and in response to detecting a mismatch between contents of the first memory and every codeword in the first set of valid codewords, outputting the second valid fault codeword.
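
The codeword check can be illustrated with a short sketch. The 8-bit codeword values, the minimum distance of 4, and the function names are all hypothetical. The abstract does not state what is output when the first memory holds the subsystem fault codeword; this sketch assumes the controller fault codeword is output in that case as well.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two codewords differ."""
    return bin(a ^ b).count("1")


# Hypothetical 8-bit codewords chosen for illustration only.
SUBSYSTEM_FUNCTIONAL = 0b10100101
SUBSYSTEM_FAULT = 0b01011010
CONTROLLER_FUNCTIONAL = 0b11000011
CONTROLLER_FAULT = 0b00111100
MIN_DISTANCE = 4  # assumed minimum Hamming distance

# The stored fault codeword must be far enough from the functional one
# that a few flipped bits cannot turn one into the other.
assert hamming_distance(SUBSYSTEM_FUNCTIONAL, SUBSYSTEM_FAULT) >= MIN_DISTANCE


def evaluate(first_memory: int) -> int:
    """Map the contents of the first memory to a controller codeword."""
    if first_memory == SUBSYSTEM_FUNCTIONAL:
        return CONTROLLER_FUNCTIONAL
    # Either the subsystem fault codeword (assumption, see lead-in) or a
    # corrupted value that matches no valid codeword.
    return CONTROLLER_FAULT
```

Because the valid codewords are far apart, a memory value corrupted by a single bit flip matches no valid codeword and is reported as a fault rather than being mistaken for functional operation.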

Alarm notification system for robot
11491655 · 2022-11-08

An alarm notification system configured to assist an operator so that the operator can effectively carry out a teaching operation, etc. The alarm notification system includes: a storing section configured to, with respect to a past alarm which occurred when a program generated by a teach pendant was executed, store alarm data including a name of the program and a number of a line of the program when the alarm occurred; a judging section configured to judge whether an alarm prediction condition using the alarm data stored in the storing section is satisfied when the program is executed again; and an alarm predicting section configured to notify the operator who is carrying out teaching of the robot of alarm information relating to the alarm, when the alarm prediction condition is satisfied.

Configuration drift management tool
11616692 · 2023-03-28

A system includes one or more databases configured to store at least one configuration rule and one or more processors in communication with the databases. The processors may be configured to compare a product parameter to configuration rules to determine a drift item based on a current value of the product parameter being different than acceptable values defined by a test specified by the configuration rule, the test comprising one of a plurality of test types. The processors may be further configured to store, based on a determination that the drift item is not in a drift database of the databases, the drift item in a database, receive a record of one or more actions performed to resolve the drift item, and in response to receipt of the record, modify a status of the drift item from unresolved to resolved in the database.
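
The drift workflow can be sketched as follows. The three test types (`equals`, `one_of`, `range`), the dictionary used as a drift database, and the function names are assumptions made for illustration; the abstract does not enumerate the test types or the storage layout.

```python
# Hypothetical test types; each returns True when the value is acceptable.
TESTS = {
    "equals": lambda value, expected: value == expected,
    "one_of": lambda value, allowed: value in allowed,
    "range": lambda value, bounds: bounds[0] <= value <= bounds[1],
}


def check_parameter(rule, name, value, drift_db):
    """Compare a parameter to a rule; store a drift item if it is new.

    Returns the drift item, or None when the value is acceptable.
    """
    if TESTS[rule["test"]](value, rule["expected"]):
        return None
    if name not in drift_db:  # store only if not already in the drift database
        drift_db[name] = {"value": value, "status": "unresolved", "actions": []}
    return drift_db[name]


def record_resolution(drift_db, name, actions):
    """On receipt of an action record, mark the drift item resolved."""
    item = drift_db[name]
    item["actions"] = list(actions)
    item["status"] = "resolved"
    return item
```

For example, with a rule `{"test": "range", "expected": (1, 10)}`, a current value of 50 produces an unresolved drift item, and a later call to `record_resolution` changes its status to resolved.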

Time-based element management in a computer system using temporal node trees

A method, apparatus, system, and computer program product for managing time-based elements. A computer system identifies the time-based elements, wherein the time-based elements have time units. The computer system arranges the nodes, representing the time-based elements, in a temporal node tree, wherein the nodes have the time units from corresponding time-based elements and a policy that defines scaling of the nodes based on the time units allocated to the nodes.

Diagnosing anomalies detected by black-box machine learning models

A computer-implemented method, a computer program product, and a computer system for diagnosing anomalies detected by a black-box machine learning model. A computer determines a local variance of a test sample in a test dataset, where the local variance represents uncertainty of a prediction by the black-box machine learning model. The computer initializes optimal compensations for the test sample, where the optimal compensations are optimal perturbations to test sample values of respective components of a multivariate input variable. The computer determines local gradients for the test sample. Based on the local variance and the local gradients, the computer updates the optimal compensations until convergences of the optimal compensations are reached. Using the optimal compensations, the computer diagnoses the anomalies detected by the black-box machine learning model.
