G06F11/0766

Fatal error logging in a memory device

Devices and techniques for fatal error logging in a memory device are described herein. For example, a read request can be received for a component of the memory device. A fatal error indication, signaling an error that prevents correct execution of the read request, can be detected. Diagnostic information for the fatal error indication can be collected. A response to the read request can then be made with a portion of the diagnostic information as payload instead of the user data that would have occupied the payload had the read succeeded. Metadata in the response can be used to communicate an error code.
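The response path described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the payload size, the status codes, the `ReadResponse` type, and the use of a missing dictionary key to stand in for a fatal error are all assumptions made for illustration.

```python
from dataclasses import dataclass

PAYLOAD_SIZE = 16      # hypothetical fixed payload size in bytes
STATUS_OK = 0x00       # hypothetical "read succeeded" code
STATUS_FATAL = 0xFE    # hypothetical error code carried in response metadata


@dataclass
class ReadResponse:
    status: int      # metadata field used to communicate the error code
    payload: bytes   # user data on success, diagnostic data on a fatal error


def handle_read(storage: dict, address: int, diagnostics: bytes) -> ReadResponse:
    """Return user data, or diagnostic bytes in the same payload slot on failure."""
    data = storage.get(address)
    if data is None:  # stands in for a detected fatal error indication
        # Only a portion of the diagnostics fits; pad short dumps with zeros.
        portion = diagnostics[:PAYLOAD_SIZE].ljust(PAYLOAD_SIZE, b"\x00")
        return ReadResponse(STATUS_FATAL, portion)
    return ReadResponse(STATUS_OK, data)
```

A host that sees `STATUS_FATAL` in the metadata would interpret the payload as diagnostic data rather than user data.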

INFORMATION RECORDING METHOD, APPARATUS, AND DEVICE, AND READABLE STORAGE MEDIUM
20230214286 · 2023-07-06

An information recording method, apparatus, and device, and a readable storage medium are provided. The method includes: when a server is started, determining a ring buffer in a Double Data Rate (DDR) memory of a Field-Programmable Gate Array (FPGA) acceleration card based on an OpenPower platform; determining a start address and an end address of the ring buffer and configuring the start address and the end address to the FPGA acceleration card; and during a running process of the server, recording preset debugging information to the ring buffer in real time, so as to perform fault location according to data in the ring buffer after a fault occurs in the server. According to the present application, during a running process of a server, preset debugging information is recorded using the DDR memory of an FPGA acceleration card; therefore, even when a downtime fault causes a Central Processing Unit (CPU) error of the server, debugging information can still be recorded, thereby facilitating fault location.
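The ring buffer bounded by a start address and an end address can be sketched as follows. This is a minimal illustration only: a `bytearray` stands in for the DDR memory, and the fixed record size and the `RingBuffer` class are assumptions not taken from the abstract.

```python
from typing import List


class RingBuffer:
    """Fixed-size records written into [start, end) of a backing memory."""

    def __init__(self, memory: bytearray, start: int, end: int, record_size: int):
        if not (0 <= start < end <= len(memory)):
            raise ValueError("start/end must lie inside the backing memory")
        if (end - start) % record_size != 0:
            raise ValueError("region must hold a whole number of records")
        self.memory = memory
        self.start, self.end = start, end
        self.record_size = record_size
        self.capacity = (end - start) // record_size
        self.write_ptr = start   # next address to write
        self.count = 0           # number of valid records currently held

    def record(self, data: bytes) -> None:
        """Write one record, overwriting the oldest once the region is full."""
        rec = data[: self.record_size].ljust(self.record_size, b"\x00")
        self.memory[self.write_ptr : self.write_ptr + self.record_size] = rec
        self.write_ptr += self.record_size
        if self.write_ptr == self.end:   # wrap back to the start address
            self.write_ptr = self.start
        self.count = min(self.count + 1, self.capacity)

    def dump(self) -> List[bytes]:
        """Return the surviving records, oldest first, for fault location."""
        pos = self.write_ptr if self.count == self.capacity else self.start
        out = []
        for _ in range(self.count):
            out.append(bytes(self.memory[pos : pos + self.record_size]))
            pos += self.record_size
            if pos == self.end:
                pos = self.start
        return out
```

Because writes never leave the `[start, end)` window, the rest of the backing memory is untouched, and after a fault the most recent records can be read back in order with `dump()`.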

Method of verifying access of multi-core interconnect to level-2 cache
11550646 · 2023-01-10

The present disclosure provides a method and a system of verifying access by a multi-core interconnect to an L2 cache in order to solve problems of delays and difficulties in locating errors and generating check expectation results. A consistency transmission monitoring circuitry detects, in real time, interactions among a multi-core interconnect system, all single-core processors, an L2 cache and a primary memory, and sends collected transmission information to an L2 cache expectation generator and a check circuitry. The L2 cache expectation generator obtains information from a global memory precise control circuitry according to a multi-core consistency protocol and generates an expected result. The check circuitry is responsible for comparing the expected result with an actual result, thus determining, without delay, whether the multi-core interconnect's accesses to the L2 cache are correct.

Performing a decoding operation to simulate switching a bit of an identified set of bits of a data block
11551772 · 2023-01-10

A set of bits of a segment of a memory device that is associated with an unsuccessful first decoding operation can be identified. A discrepancy value for at least one bit of the set of bits can be calculated. It can be determined whether the discrepancy value calculated for the at least one bit of the set of bits corresponds to a correction capability of the failed decoding operation. In response to determining that the discrepancy value calculated for the at least one bit corresponds to the correction capability of the failed decoding operation, the at least one bit of the set of bits can be corrected by switching a value of the at least one bit.
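The idea of simulating a bit switch and keeping it only if the result is decodable can be sketched with a generic bit-flipping pass. This is not the patented procedure: the (7,4) Hamming parity-check matrix is swapped in as a stand-in code, the "discrepancy value" is assumed to be the number of unsatisfied parity checks after the trial flip, and the `threshold` parameter is a hypothetical stand-in for the correction capability.

```python
# Parity-check matrix of the (7,4) Hamming code; column i encodes i + 1 in binary.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def syndrome_weight(bits):
    """Count unsatisfied parity checks (the assumed discrepancy value)."""
    return sum(sum(h * b for h, b in zip(row, bits)) % 2 for row in H)


def try_bit_flips(bits, candidates, threshold=0):
    """Simulate flipping each candidate bit; keep the first flip that decodes.

    Returns (corrected_bits, flipped_index), or (original_bits, None) if no
    single flip brings the discrepancy down to the threshold.
    """
    for i in candidates:
        trial = list(bits)
        trial[i] ^= 1                      # simulate switching this bit
        if syndrome_weight(trial) <= threshold:
            return trial, i
    return list(bits), None
```

For example, `[1, 1, 1, 0, 0, 0, 0]` satisfies all three checks; corrupting index 4 and calling `try_bit_flips(received, range(7))` restores the original word and reports index 4.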

PROGRAMMABLE SIGNAL AGGREGATOR
20230214292 · 2023-07-06

In an embodiment, an electronic circuit includes: a plurality of signal channels; a signal collection circuit configured to determine an action of the electronic circuit based on channel signals from the plurality of signal channels; and a first signal management circuit coupled between the plurality of signal channels and the signal collection circuit, the first signal management circuit including: a set of internal registers, a set of user registers, and a decoder configured to program the set of internal registers based on a content of the set of user registers, where the first signal management circuit is configured to receive the channel signals via the plurality of signal channels, generate first aggregated signals based on the received channel signals and a content of the set of internal registers, and transmit the first aggregated signals to the signal collection circuit.

Early boot event logging system
11550664 · 2023-01-10

An early boot debug system includes a first memory subsystem that includes boot instructions and a processing system that is coupled to the first memory subsystem. The processing system includes a primary processing subsystem, and a secondary processing subsystem that is coupled to the primary processing subsystem and a second memory subsystem. The secondary processing subsystem copies the boot instructions from the first memory subsystem to the second memory subsystem and executes the boot instructions from the second memory subsystem during a boot operation. The secondary processing subsystem then detects a first event during the execution of the boot instructions and, in response, generates first event information. The secondary processing subsystem stores the first event information in the second memory subsystem to be retrieved on demand by an administrator.

DATA TAPE MEDIA QUALITY VALIDATION AND ACTION RECOMMENDATION

Techniques for generating action recommendations for a data tape system are disclosed. A data tape system generates action recommendations for a data tape based on library-based metadata messages as well as a measured data quality value of the data tape. The system initiates an operation resulting in the data tape interacting with a media drive. A data tape library controller generates one or more metadata messages based on a result of a requested operation. The metadata message may include information regarding the type of error and a default recommended course of action. The system generates the recommended action for the data tape using a trained machine learning model.

METHOD FOR ENCODED DIAGNOSTICS IN A FUNCTIONAL SAFETY SYSTEM
20230006697 · 2023-01-05

A method includes: storing a set of valid codewords including a first valid functional codeword representing a functional state of a controller subsystem, a first valid fault codeword representing a fault state of the controller subsystem and characterized by a minimum Hamming distance from the first valid functional codeword, a second valid functional codeword representing a functional state of a controller, and a second valid fault codeword representing a fault state of the controller; in response to detecting functional operation of the controller subsystem, storing the first valid functional codeword in a first memory; in response to detecting a match between contents of the first memory and the first valid functional codeword, outputting the second valid functional codeword; and, in response to detecting a mismatch between contents of the first memory and every codeword in the set of valid codewords, outputting the second valid fault codeword.
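The codeword scheme can be illustrated with a small sketch. The 8-bit codeword values, the minimum distance of 4, and the `evaluate` function are hypothetical choices for illustration. The abstract does not say what is output when the memory holds the first valid fault codeword; this sketch assumes the fault codeword is output in that case as well.

```python
from itertools import combinations


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two codewords differ."""
    return bin(a ^ b).count("1")


# Hypothetical 8-bit codewords, chosen so that every pair differs in >= 4 bits.
SUBSYS_FUNCTIONAL = 0b10100101   # first valid functional codeword
SUBSYS_FAULT = 0b01011010        # first valid fault codeword
CTRL_FUNCTIONAL = 0b11000011     # second valid functional codeword
CTRL_FAULT = 0b00111100          # second valid fault codeword
VALID_CODEWORDS = (SUBSYS_FUNCTIONAL, SUBSYS_FAULT, CTRL_FUNCTIONAL, CTRL_FAULT)
MIN_DISTANCE = 4                 # hypothetical minimum Hamming distance


def evaluate(memory_contents: int) -> int:
    """Map the first memory's contents to the controller-level codeword."""
    if memory_contents == SUBSYS_FUNCTIONAL:
        return CTRL_FUNCTIONAL
    # Covers both a corrupted value that matches no valid codeword and
    # (by assumption in this sketch) the subsystem fault codeword itself.
    return CTRL_FAULT


# Sanity check: the chosen codewords respect the minimum distance.
assert all(
    hamming_distance(a, b) >= MIN_DISTANCE
    for a, b in combinations(VALID_CODEWORDS, 2)
)
```

With this spacing, a single-bit corruption of the stored functional codeword cannot produce another valid codeword, so `evaluate` reports the fault codeword instead of a false functional state.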

IPS SOC PLL monitoring and error reporting

The systems and methods described herein provide the ability to detect a clocking element fault within an IC device and switch to an alternate clock. In response to detection of a fault in a phase-locked loop (PLL) clocking element, the device may switch to an alternate clock so that error reporting logic can make forward progress on generating an error message. The error message may be generated within an Intellectual Property (IP) core (e.g., an IP block) and sent from the IP core to a system-on-a-chip (SOC), such as through an SOC Functional Safety (FuSA) error reporting infrastructure. In various examples, the clocking error may also be output to a hardware SOC pin, such as to provide a redundant path for error indication.

WATCHPOINTS FOR DEBUGGING IN A GRAPHICS ENVIRONMENT

An apparatus to facilitate watchpoints for debugging in a graphics environment is disclosed. The apparatus includes processing resources to perform graphics operations using a plurality of threads; and load store pipeline hardware circuitry coupled to the processing resources to: configure a watchpoint register with a value of a watchpoint address, the watchpoint address comprising an address of a memory location in the processor; receive a memory access request from a thread of the plurality of threads; determine, using the watchpoint register, whether the memory access request is requesting access to the watchpoint address; and responsive to the memory access request requesting access to the watchpoint address, return an exception payload to the thread, the exception payload comprising watchpoint details corresponding to the watchpoint address and a scoreboard identifier (SBID) associated with the memory access request.
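The watchpoint check can be modeled in software as a rough sketch. The class and field names below are hypothetical, and the real mechanism is hardware circuitry in the load store pipeline, not Python; the sketch only shows the compare-and-return-payload logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryAccess:
    thread_id: int
    address: int
    sbid: int          # scoreboard identifier attached to the request


@dataclass
class ExceptionPayload:
    watchpoint_address: int   # watchpoint details returned to the thread
    sbid: int                 # SBID of the request that hit the watchpoint


class LoadStorePipeline:
    """Software model of a pipeline with a single watchpoint register."""

    def __init__(self) -> None:
        self.watchpoint_register: Optional[int] = None

    def set_watchpoint(self, address: int) -> None:
        self.watchpoint_register = address

    def access(self, request: MemoryAccess) -> Optional[ExceptionPayload]:
        """Return an exception payload if the request hits the watchpoint."""
        if (
            self.watchpoint_register is not None
            and request.address == self.watchpoint_register
        ):
            return ExceptionPayload(request.address, request.sbid)
        return None   # normal access; no exception is raised
```

A debugger built on such a model would set the register once and then inspect any non-`None` result of `access` to learn which request, by SBID, touched the watched location.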