G06F11/0703

Reporting control information errors

Methods, systems, and devices for reporting control information errors are described. A state of a memory array may be monitored during operation. After detecting an error (e.g., in received control information), the memory device may enter a first state (e.g., a locked state) and may indicate to a host device that an error was detected, the state of the memory array before the error was detected, and/or at least a portion of a control signal carrying the received control information. The host device may diagnose a cause of the error based on receiving the indication of the error and/or the copy of the control signal. After identifying and/or resolving the cause of the error, the host device may transmit one or more commands (e.g., unlocking the memory device and returning the memory array to the original state) based on receiving the original state from the memory device.
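
The lock-and-report flow in this abstract can be illustrated with a minimal sketch. The command names, state labels, and the `ErrorReport` and `MemoryDevice` classes are hypothetical and are not taken from the patent. The sketch only shows the general idea: the device records the array state before the error, locks itself, and hands the host the prior state together with a copy of the offending control signal.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from valid commands to the array state they produce.
VALID_COMMANDS = {"ACTIVATE": "active", "PRECHARGE": "idle"}


@dataclass
class ErrorReport:
    prior_state: str      # array state before the error was detected
    control_signal: str   # copy of the control signal that carried the error


class MemoryDevice:
    def __init__(self) -> None:
        self.array_state = "idle"
        self.locked = False
        self.report: Optional[ErrorReport] = None

    def receive(self, control_signal: str) -> Optional[ErrorReport]:
        """Apply a command, or lock and report if the command is not recognized."""
        if self.locked:
            # While locked, ignore commands and keep returning the stored report.
            return self.report
        if control_signal not in VALID_COMMANDS:
            self.report = ErrorReport(self.array_state, control_signal)
            self.locked = True
            self.array_state = "unknown"
            return self.report
        self.array_state = VALID_COMMANDS[control_signal]
        return None

    def unlock_and_restore(self, state: str) -> None:
        """Host-issued recovery: unlock and return the array to the given state."""
        self.locked = False
        self.array_state = state
        self.report = None
```

A host would call `receive`, inspect any returned report to diagnose the cause, and then call `unlock_and_restore(report.prior_state)`.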

Memory error detection
11579965 · 2023-02-14

Systems and methods are provided for detecting and correcting address errors in a memory system. In the memory system, a memory device generates an error-detection code based on an address transmitted via an address bus and transmits the error-detection code to a memory controller. The memory controller transmits an error indication to the memory device in response to the error-detection code. The error indication causes the memory device to remove the received address and prevent a memory operation.
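
The address check described above can be sketched as follows. This is an illustration under stated assumptions, not the patented design: CRC-32 (via Python's `zlib.crc32`) stands in for the unspecified error-detection code, addresses are assumed to fit in 32 bits, and the class names and the `bus` parameter (a function that models corruption on the address bus) are hypothetical.

```python
import zlib


def edc(address: int) -> int:
    """Error-detection code over a 32-bit address (CRC-32 is an assumption)."""
    return zlib.crc32(address.to_bytes(4, "big"))


class MemoryDevice:
    def __init__(self) -> None:
        self.pending_address = None
        self.cells = {}

    def receive_address(self, address: int) -> int:
        """Latch the received address and return its error-detection code."""
        self.pending_address = address
        return edc(address)

    def error_indication(self) -> None:
        """Controller reported a mismatch: remove the received address."""
        self.pending_address = None

    def write(self, value: int) -> bool:
        """Perform the write only if an address is still latched."""
        if self.pending_address is None:
            return False
        self.cells[self.pending_address] = value
        self.pending_address = None
        return True


class MemoryController:
    def write(self, device: MemoryDevice, address: int, value: int,
              bus=lambda a: a) -> bool:
        # The device computes the code from the address it actually received.
        code = device.receive_address(bus(address))
        # The controller compares it against the code of the address it sent.
        if code != edc(address):
            device.error_indication()
        return device.write(value)
```

Passing `bus=lambda a: a ^ 1` flips one address bit in transit, and the write is then suppressed instead of landing at the wrong location.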

LOG ANALYZER FOR FAULT DETECTION
20230011129 · 2023-01-12

Apparatuses and methods for anomaly detection. In one embodiment, a method implemented in a computing device for building a tree structure to represent a system behavior includes obtaining one or more training log records and building a tree structure using the one or more training log records. Each of the one or more training log records includes one or more log elements. The tree structure includes a plurality of tree nodes, with each successive tree node in a root-to-leaf path of the tree structure representing successive log elements of the one or more training log records. In one embodiment, a method implemented in a computing device for fault detection includes obtaining a live log record and determining an anomaly in the live log record by comparing corresponding successive elements of the live log record to successive nodes in a root-to-leaf direction of the tree structure.
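
A minimal sketch of the tree-based comparison, assuming each log record has already been split into a list of string elements. The nested-dictionary representation, the function names, and the sample records are illustrative choices, not details from the application.

```python
def build_tree(training_records):
    """Build a prefix tree; each root-to-leaf path mirrors one training record."""
    root = {}
    for record in training_records:
        node = root
        for element in record:
            # Reuse the child node for this element, or create it.
            node = node.setdefault(element, {})
    return root


def find_anomaly(tree, live_record):
    """Return the index of the first element with no matching node, else None."""
    node = tree
    for index, element in enumerate(live_record):
        if element not in node:
            return index
        node = node[element]
    return None


# Hypothetical training data: records already tokenized into log elements.
training = [
    ["sshd", "session", "opened"],
    ["sshd", "session", "closed"],
    ["cron", "job", "started"],
]
tree = build_tree(training)
```

With this tree, a live record such as `["sshd", "session", "killed"]` diverges at index 2, the first element that has no matching node along the root-to-leaf walk.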

METHOD AND SYSTEM FOR AUTOMATED HEALING OF HARDWARE RESOURCES IN A COMPOSED INFORMATION HANDLING SYSTEM

In general, the invention relates to providing computer implemented services using information handling systems. One or more embodiments include, after a hardware resource is allocated to a composed information handling system: monitoring health of the hardware resource, making a determination, based on the monitoring of the health of the hardware resource, that the hardware resource is in a compromised state, and, based on the determination, initiating a hardware replacement operation using replacement option information (ROI) for the hardware resource and replacement conditions for the hardware resource.

Automatically predicting device failure using machine learning techniques

Methods, apparatus, and processor-readable storage media for automatically predicting device failure using machine learning techniques are provided herein. An example computer-implemented method includes obtaining telemetry data from at least one client device; predicting failure of at least a portion of the at least one client device by processing at least a portion of the telemetry data using a first set of one or more machine learning techniques; predicting lifespan information pertaining to at least a portion of the at least one client device by processing the predicted failure and at least a portion of the telemetry data using a second set of one or more machine learning techniques; and performing at least one automated action based at least in part on one or more of the predicted failure and the predicted lifespan information.

System and method for detecting anomalies by discovering sequences in log entries
11513935 · 2022-11-29

A method for detecting an anomaly includes retrieving a log file that includes log entries, grouping the log entries into clusters of log entry types based on number of occurrences and average time interval, and discovering a sequence of the log entry types within each of the clusters. The sequence of the log entry types is based on a shortest path from a first one of the log entry types to a last one of the log entry types.

Visualization of outliers in a highly-skewed distribution of telemetry data
11609704 · 2023-03-21

Systems and methods for enhancing the representation of outliers in a distribution of telemetry data of a monitored system are provided. According to one embodiment, telemetry data of the monitored system may be continuously collected. Frequency values representing a frequency of occurrence of corresponding telemetry data of the collected telemetry data may be generated by aggregating the collected telemetry data. As the vast majority of telemetry data is expected to represent a normal operating state of the system and relatively few, if any, of the telemetry data (e.g., outliers) will be indicative of one or more events of significance, the resulting distribution of the frequency values is highly skewed. In order to facilitate visualization of the distribution that accentuates the outliers, display characteristics may be calculated for the frequency values by applying a visualization model based on a weighted combination of multiple data transformations to each of the frequency values.
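
The abstract does not say which data transformations are combined, so the following sketch assumes two common choices, a logarithmic and a square-root transform, each normalized to the range 0 to 1 and blended with hypothetical weights. The function name and the default weights are illustrative only.

```python
import math


def display_value(freq, max_freq, w_log=0.5, w_sqrt=0.5):
    """Map a frequency to a display intensity in [0, 1].

    Uses a weighted combination of two transforms (an assumption):
    a log transform and a square-root transform, both normalized by
    their value at max_freq. Requires max_freq > 0.
    """
    log_part = math.log1p(freq) / math.log1p(max_freq)
    sqrt_part = math.sqrt(freq) / math.sqrt(max_freq)
    return w_log * log_part + w_sqrt * sqrt_part
```

Under linear scaling, a value seen once next to a value seen 10,000 times would receive an intensity of 0.0001 and be effectively invisible. Both transforms compress the upper range, so rare values receive a noticeably larger share of the display scale.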

Data storage system data access arbitration

A data storage system can organize a semiconductor memory into a first data set and a second data set with a first queue populated with a first data access request from a host. An assignment of an arbitration weight to the first queue with an arbitration circuit corresponds with the first queue being skipped during a deterministic window based on the arbitration weight.

Concept for Handling Transient Errors
20230124832 · 2023-04-20

Examples relate to a concept for handling transient errors. An apparatus for correcting transient errors in a computational device comprises interface circuitry, machine-readable instructions, and processing circuitry for executing the machine-readable instructions to: obtain a signal indicating that a transient error has been detected in the computational device, the computational device being configured to perform computations using processing elements and connections between the processing elements; extract a state of the computational device, the state comprising at least one of present and previous values transmitted via the connections between the processing elements and state contained within the one or more processing elements; compute a corrected state of the computational device based on the state extracted from the computational device; and configure a computational device with the corrected state.

INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING SYSTEM AND PROGRAM

An information processing device includes: an acquisition unit configured to acquire a determination result of a state of a user, who has given a transmission job execution instruction, determined based on biological information of the user; and a job control unit configured to control an execution of the transmission job according to the user state determination result, wherein when it is determined that the user is in an off-normal state, the job control unit executes a confirmation request process to request the user to make a confirmation related to the transmission job.