G06F11/141

Performance and deadlock mitigation during a memory die fail storm

A method is described that includes processing, by a memory subsystem, a read memory command that is addressed to a first die of a memory device. The memory subsystem determines whether processing the read memory command failed to correctly read user data from the first die and, in response to determining that processing the read memory command failed to correctly read user data from the first die, determines whether the first die has failed. In response to determining that the first die has failed, the memory subsystem performs an abbreviated error recovery procedure to successfully perform the read memory command instead of a full error recovery procedure.
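
The decision flow in this abstract can be sketched in Python. This is a minimal illustration, not the claimed method: the step names and the contents of both recovery lists are assumptions, since the abstract does not say which steps the abbreviated procedure keeps.

```python
# Hypothetical recovery steps, ordered from cheapest to most expensive.
FULL_RECOVERY = ["read_retry", "soft_decode", "parity_rebuild"]

# Assumption: on a failed die, per-die retries cannot succeed, so the
# abbreviated procedure goes straight to rebuilding from other dies.
ABBREVIATED_RECOVERY = ["parity_rebuild"]


def choose_recovery(read_ok: bool, die_failed: bool) -> list:
    """Return the recovery steps to run for one read command."""
    if read_ok:
        return []  # user data was read correctly, nothing to recover
    if die_failed:
        return list(ABBREVIATED_RECOVERY)
    return list(FULL_RECOVERY)
```

During a die fail storm, every read addressed to the failed die would take the short path, which is where the performance benefit described in the title would come from.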

Storage System and Method for Data Recovery After Detection of an Uncorrectable Error

A storage system caches, in volatile memory, data read from non-volatile memory. After detecting an uncorrectable error in the data cached in the volatile memory, the storage system replaces the cached data with data re-read from the non-volatile memory and updated to reflect any changes made to the data after it was stored in the non-volatile memory. The storage system can also analyze a pattern in data adjacent to the uncorrectable error and predict corrected data based on the pattern.

TECHNOLOGIES FOR SWITCHING NETWORK TRAFFIC IN A DATA CENTER

Technologies for switching network traffic include a network switch. The network switch includes one or more processors and communication circuitry coupled to the one or more processors. The communication circuitry is capable of switching network traffic of multiple link layer protocols. Additionally, the network switch includes one or more memory devices storing instructions that, when executed, cause the network switch to receive, with the communication circuitry through an optical connection, network traffic to be forwarded, and determine a link layer protocol of the received network traffic. The instructions additionally cause the network switch to forward the network traffic as a function of the determined link layer protocol. Other embodiments are also described and claimed.

Memory storage apparatus with protection of command data in a host buffer in response to a system abnormality
11614997 · 2023-03-28

A method for managing a host memory buffer, a memory storage apparatus, and a memory control circuit unit are provided. The method includes: detecting whether a system abnormality occurs; copying a first command and first data corresponding to the first command stored in a data buffer of a host system to the memory storage apparatus in response to determining that the system abnormality occurs; executing an initial operation after copying the first command and the first data, wherein the initial operation initializes a part of a hardware circuit in the memory storage apparatus and does not initialize another part of the hardware circuit in the memory storage apparatus; and re-executing the first command stored in the memory storage apparatus after initializing the part of the hardware circuit.

Storage device and method of operating the same
11487627 · 2022-11-01

A storage device having improved data recovery performance includes a memory device including a first storage region and a second storage region, and a memory controller that controls the memory device. Before performing a write operation in the first storage region, the memory controller may back up data previously stored in the first storage region, based on a fail probability of the write operation to be performed in the first storage region. If the write operation fails, the previously stored data may be recovered from where it was backed up.
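
The backup-before-write behaviour can be sketched as a small Python model. The threshold value, the use of the second storage region as the backup location, and the `write_fails` flag that simulates a failed program operation are all assumptions made for illustration.

```python
FAIL_PROBABILITY_THRESHOLD = 0.1  # hypothetical cut-off


class StorageDevice:
    def __init__(self):
        self.first_region = {}   # address -> data
        self.second_region = {}  # assumption: used as the backup location

    def write(self, addr, data, fail_probability, write_fails=False):
        """Write to the first region, backing up old data when the risk is high."""
        backed_up = False
        if addr in self.first_region and fail_probability >= FAIL_PROBABILITY_THRESHOLD:
            self.second_region[addr] = self.first_region[addr]
            backed_up = True

        if write_fails:
            # Model a failed write as destroying what was stored at addr.
            self.first_region.pop(addr, None)
            if backed_up:
                # Recover the previously stored data from the backup.
                self.first_region[addr] = self.second_region[addr]
            return False

        self.first_region[addr] = data
        return True
```

In this sketch a low-risk write skips the backup copy entirely, so the extra cost is only paid when a failure is considered likely.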

METHOD AND SYSTEM FOR ENSURING FAILURE ATOMICITY IN NON-VOLATILE MEMORY
20220334918 · 2022-10-20

Disclosed are a method and a system for ensuring failure atomicity in a non-volatile memory, in the field of computer storage. The method comprises executing transactions, each encapsulating one or more operations that need failure atomicity, in accordance with the following steps: executing the operations in the current transaction in sequence; for each write operation in the current transaction, determining whether the oldest value of its corresponding data is saved into a log of the non-volatile memory and, if so, creating an UndoRedo log entry for it, otherwise creating a Redo log entry for it; using corresponding log management strategies according to the types of log entries; after all operations are executed, committing the current transaction; and completing the execution of the current transaction. The information recorded in an UndoRedo log entry comprises the transaction number, the write operation address, and the oldest value and new value of the corresponding data; the information recorded in a Redo log entry comprises the transaction number, the write operation address, and the new value of the corresponding data. The invention can reduce the overhead of ensuring failure atomicity in the NVMM.
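
The two log entry types and the choice between them can be sketched in Python. The field lists follow the abstract; the class names, the `make_log_entry` helper, and the `oldest_value_in_log` flag are hypothetical, and the caller is assumed to evaluate the abstract's condition itself.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class RedoEntry:
    txn_id: int       # transaction number
    address: int      # write operation address
    new_value: Any


@dataclass(frozen=True)
class UndoRedoEntry:
    txn_id: int       # transaction number
    address: int      # write operation address
    oldest_value: Any
    new_value: Any


def make_log_entry(txn_id, address, new_value,
                   oldest_value=None, oldest_value_in_log=False):
    """Create the entry type selected by the abstract's per-write check."""
    if oldest_value_in_log:
        return UndoRedoEntry(txn_id, address, oldest_value, new_value)
    return RedoEntry(txn_id, address, new_value)
```

A Redo entry carries one value fewer than an UndoRedo entry, which is one plausible source of the reduced logging overhead the abstract mentions.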

Handling guard tag loss

An apparatus comprising memory access circuitry to perform a tag-guarded memory access in response to a received target address and methods of operation of the same are disclosed. In the tag-guarded memory access a guard-tag retrieval operation seeks to retrieve a guard tag stored in association with a block of one or more memory locations comprising an addressed location identified by the received target address, and a guard-tag check operation compares an address tag associated with the received target address with the guard tag retrieved by the guard-tag retrieval operation. When the guard-tag retrieval operation is unsuccessful in retrieving the guard tag, a substitute guard tag value is stored as the guard tag in association with the block of one or more memory locations comprising the addressed location identified by the target address.
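
The tag-guarded access can be modelled in a few lines of Python. The block size, the substitute tag value, and the status strings are assumptions; a missing dictionary entry stands in for an unsuccessful guard-tag retrieval, and the abstract does not say how the check is reported in that case.

```python
GRANULE = 16        # hypothetical block size in bytes
SUBSTITUTE_TAG = 0  # hypothetical substitute guard tag value


class TagGuardedMemory:
    def __init__(self):
        # block index -> guard tag; a missing key models a lost guard tag
        self.guard_tags = {}

    def access(self, address, address_tag):
        """Run the guard-tag retrieval and check for one target address."""
        block = address // GRANULE
        guard_tag = self.guard_tags.get(block)
        if guard_tag is None:
            # Retrieval failed: store the substitute value as the guard tag
            # for the block containing the addressed location.
            self.guard_tags[block] = SUBSTITUTE_TAG
            return "tag_lost"
        return "match" if guard_tag == address_tag else "mismatch"
```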

A MULTI-PART COMPARE AND EXCHANGE OPERATION
20230116945 · 2023-04-20

A method for executing an atomic compare and exchange operation. The method may include processing a compare command and a conditional exchange command while considering hardware failures.

INFERENCE CALCULATION FOR NEURAL NETWORKS WITH PROTECTION AGAINST MEMORY ERRORS

A method for operating a hardware platform for the inference calculation of a layered neural network. In the method: a first portion of input data which are required for the inference calculation of a first layer of the neural network and redundancy information relating to the input data are read in from an external working memory into an internal working memory of the computing unit; the integrity of the input data is checked based on the redundancy information; in response to the input data being identified as error-free, the computing unit carries out at least part of the first-layer inference calculation for the input data to obtain a work result; redundancy information for the work result is determined, based on which the integrity of the work result can be verified; and the work result and the redundancy information are written to the external working memory.
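
The read, check, compute, and write-back sequence can be sketched in Python. Using CRC-32 as the redundancy information, a dictionary as the external working memory, and a plain matrix-vector product as the layer are assumptions chosen for brevity; the abstract does not name a specific code or layer type.

```python
import struct
import zlib


def checksum(values):
    """Redundancy information for a list of floats (assumed here: CRC-32)."""
    return zlib.crc32(struct.pack(f"{len(values)}d", *values))


def run_layer(external, key_in, key_out, weights):
    """Verify inputs, compute one layer, and store the result with its checksum."""
    data, stored_crc = external[key_in]
    internal = list(data)  # copy into the "internal working memory"

    if checksum(internal) != stored_crc:
        raise ValueError("input data failed the integrity check")

    # At least part of the layer's inference calculation: a matrix-vector product.
    result = [sum(w * x for w, x in zip(row, internal)) for row in weights]

    # Write the work result back together with its redundancy information.
    external[key_out] = (result, checksum(result))
    return result
```

Because the result is stored with its own checksum, the next layer can repeat the same check when it reads that result back as its input.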

SYSTEM AND METHOD FOR FAULT IDENTIFICATION AND FAULT HANDLING IN A MULTIPORT POWER SOURCING DEVICE
20230068583 · 2023-03-02

A system and method for fault identification and fault handling in a multi-port power sourcing device (MPSD) are provided. The system includes a multi-port power sourcing device with multiple ports. A master is configured to: send a slave discovery request to multiple slave ports; receive a slave discovery response from the multiple slave ports; and reset the watchdog timer in each of the multiple ports by periodically sending a watchdog refresh instruction. Each of the multiple ports experiences a watchdog timer timeout upon failing to receive the watchdog refresh instruction and generates its own port reset upon that timeout, to resolve one or more faults associated with the port. Each port also includes a role change staggered timer, which is triggered upon the corresponding watchdog timer timeout and reset upon receiving the watchdog refresh instruction from the master. The slave port whose role change staggered timer times out first changes its role and starts functioning as the new master port.
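
The interplay of the watchdog timer and the role change staggered timer can be simulated with a tick-based Python model. The class and function names, the tick granularity, and the specific timeout values in the usage note are assumptions for illustration only.

```python
class Port:
    def __init__(self, name, watchdog_timeout, role_change_delay):
        self.name = name
        self.watchdog_timeout = watchdog_timeout
        self.role_change_delay = role_change_delay  # staggered per port
        self.role = "slave"
        self.watchdog = 0
        self.role_timer = None  # None means the timer is not running
        self.resets = 0

    def refresh(self):
        """Handle a watchdog refresh instruction from the master."""
        self.watchdog = 0
        self.role_timer = None

    def tick(self):
        """Advance one time unit without having received a refresh."""
        if self.role == "master":
            return
        if self.role_timer is not None:
            self.role_timer += 1
            if self.role_timer >= self.role_change_delay:
                self.role = "master"  # first timer to expire wins
            return
        self.watchdog += 1
        if self.watchdog >= self.watchdog_timeout:
            self.resets += 1       # port reset to clear its faults
            self.role_timer = 0    # start the role change staggered timer


def simulate(ports, ticks):
    """Run the ports; any current master refreshes the others each tick."""
    for _ in range(ticks):
        master = next((p for p in ports if p.role == "master"), None)
        for p in ports:
            if master is not None and p is not master:
                p.refresh()
            else:
                p.tick()
```

With no master present, two ports created as `Port("A", 3, 2)` and `Port("B", 3, 4)` both reset after three ticks; port A's shorter staggered delay lets it take the master role first, after which its refreshes stop port B's timer.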