G06F11/16

Memory-based distributed processor architecture
11301340 · 2022-04-12

Distributed processors and methods for compiling code for execution by distributed processors are disclosed. In one implementation, a distributed processor may include a substrate; a memory array disposed on the substrate; and a processing array disposed on the substrate. The memory array may include a plurality of discrete memory banks, and the processing array may include a plurality of processor subunits, each one of the processor subunits being associated with a corresponding, dedicated one of the plurality of discrete memory banks. The distributed processor may further include a first plurality of buses, each connecting one of the plurality of processor subunits to its corresponding, dedicated memory bank, and a second plurality of buses, each connecting one of the plurality of processor subunits to another of the plurality of processor subunits.

METHOD AND APPARATUS TO NEUTRALIZE REPLICATION ERROR AND RETAIN PRIMARY AND SECONDARY SYNCHRONIZATION DURING SYNCHRONOUS REPLICATION
20220100600 · 2022-03-31

Techniques are provided for neutralizing replication errors. An operation is executed upon a first storage object and is replicated as a replicated operation for execution upon a second storage object. A first error may be received for the replicated operation. Instead of transitioning to an out of sync state and aborting the operation, a wait is performed until a result of the attempted execution of the operation is received. If the first error is the same as a second error returned for the operation, then the operation and replicated operation are considered successful and a synchronous replication relationship is kept in sync. If the first error and the second error are different errors, then an error response is returned for the operation and the synchronous replication relationship is transitioned to out of sync.
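The comparison step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `Outcome` and `reconcile`, the use of strings as error codes, and the handling of the case where only one side fails are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Outcome:
    treated_as_success: bool  # whether operation and replica count as successful
    in_sync: bool             # whether the replication relationship stays in sync


def reconcile(primary_error: Optional[str], replica_error: Optional[str]) -> Outcome:
    """Decide the outcome once BOTH results are known (hypothetical helper).

    The caller is expected to wait for the primary result before calling
    this, rather than going out of sync as soon as the replica error arrives.
    """
    if primary_error is None and replica_error is None:
        # Ordinary case: both executions succeeded.
        return Outcome(treated_as_success=True, in_sync=True)
    if primary_error is not None and primary_error == replica_error:
        # Both sides failed with the same error, so the two storage objects
        # still match: the error is neutralized and the relationship is kept.
        return Outcome(treated_as_success=True, in_sync=True)
    # Differing errors. Assumption: a failure on only one side is treated
    # the same way, since the abstract does not cover that case.
    return Outcome(treated_as_success=False, in_sync=False)
```

For instance, `reconcile("ENOSPC", "ENOSPC")` keeps the relationship in sync, while `reconcile("ENOSPC", "EIO")` transitions it to out of sync.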

Highly resilient synchronous replication with automatic recovery

In one aspect, automatic recovery of a synchronous replication session in response to an error is provided for a storage system that includes source and target sites. During an active sync replication session in which a state machine indicates the system is operating in sync, an aspect includes monitoring input/output (IO) operations. Upon determining an occurrence of the error in which data has been persisted at the source site but not at the target site, an aspect includes discontinuing replication to the target site and transitioning the state machine from a sync state to a tripped state. Upon determining, during the tripped state, that resources exist to conduct sync replication remote data transfer operations, an aspect includes transitioning the state machine to an async_to_sync state. The async_to_sync state causes the storage system to initiate a recovery operation to return the source and target sites to the sync state.
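The state transitions in this abstract can be sketched as a small state machine. The state names `sync`, `tripped`, and `async_to_sync` come from the abstract; the class `ReplicationSession`, its method names, and the boolean inputs are hypothetical and chosen only for illustration.

```python
from enum import Enum


class State(Enum):
    SYNC = "sync"
    TRIPPED = "tripped"
    ASYNC_TO_SYNC = "async_to_sync"


class ReplicationSession:
    """Toy model of the recovery state machine (names are assumptions)."""

    def __init__(self) -> None:
        self.state = State.SYNC
        self.replicating = True

    def on_io(self, persisted_at_source: bool, persisted_at_target: bool) -> None:
        # Error case: data reached the source but not the target.
        if self.state is State.SYNC and persisted_at_source and not persisted_at_target:
            self.replicating = False      # discontinue replication to the target
            self.state = State.TRIPPED

    def on_resources(self, available: bool) -> None:
        # Leave the tripped state only when transfer resources exist.
        if self.state is State.TRIPPED and available:
            self.state = State.ASYNC_TO_SYNC

    def on_recovery_complete(self) -> None:
        # Recovery started in async_to_sync returns both sites to sync.
        if self.state is State.ASYNC_TO_SYNC:
            self.replicating = True
            self.state = State.SYNC
```

Each handler checks the current state first, so an event that arrives in the wrong state is ignored rather than forcing an invalid transition.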

Data storage device with spare blocks for replacing bad block in super block and operating method thereof
11275678 · 2022-03-15

A data storage device includes a memory device and a controller. The memory device includes a plurality of planes, wherein each of the planes includes two or more memory blocks. The controller is configured to control an operation of the memory device. The controller is further configured to generate a first super block as a super block including two or more way-interleavable memory blocks among the plurality of memory blocks of the plurality of planes, determine whether each of the memory blocks included in the first super block is a bad block, retrieve a spare block for replacing a first memory block determined as a bad block, in the plurality of planes, and generate a second replacing super block as a super block in which the first memory block is replaced with a second memory block positioned in a plane which does not have the first memory block, when all spare blocks of a plane including the first memory block are used.
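The replacement policy can be illustrated with a short sketch. The data layout here is an assumption made for the example: a super block is a list of `(plane, block)` pairs and `spares` maps each plane to its unused spare block numbers. The function name `replace_bad_block` and the choice of the lowest-numbered donor plane are likewise hypothetical, and the way-interleaving constraint is not modeled.

```python
from typing import Dict, List, Optional, Tuple

Block = Tuple[int, int]  # (plane index, block index); layout is an assumption


def replace_bad_block(
    super_block: List[Block],
    bad: Block,
    spares: Dict[int, List[int]],
) -> Optional[List[Block]]:
    """Return a new super block with `bad` swapped for a spare block.

    A spare from the bad block's own plane is preferred. If that plane has
    no spares left, a spare from another plane is used. Returns None when
    no spare exists anywhere.
    """
    plane, _ = bad
    if spares.get(plane):
        replacement = (plane, spares[plane].pop(0))
    else:
        # Own plane exhausted: borrow from the first other plane with a spare.
        donor = next((p for p in sorted(spares) if p != plane and spares[p]), None)
        if donor is None:
            return None
        replacement = (donor, spares[donor].pop(0))
    return [replacement if blk == bad else blk for blk in super_block]
```

With `spares = {0: [], 1: [9]}`, replacing bad block `(0, 5)` in `[(0, 5), (1, 5)]` yields `[(1, 9), (1, 5)]`, because plane 0 has no spare blocks left.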

Storage systems configured for storage volume addition in synchronous replication using active-active configuration
11275765 · 2022-03-15

An apparatus in one embodiment comprises at least one processing device comprising a processor coupled to a memory. The at least one processing device is configured to identify a storage volume to be added to a first consistency group of a first synchronous replication session between a first storage system and a second storage system in an active-active configuration, to create a second synchronous replication session for the added storage volume between the first storage system and the second storage system, and to merge the first and second synchronous replication sessions responsive to one or more designated criteria. The second synchronous replication session is illustratively configured to be fully independent of the first synchronous replication session. Merging the first and second synchronous replication sessions responsive to one or more designated criteria illustratively comprises merging the first and second synchronous replication sessions responsive to the second synchronous replication session reaching a specified steady state.
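A rough sketch of the add-then-merge flow follows. It assumes a session can be modeled as a set of volume names plus a flag for the steady state; `SyncSession`, `add_volume`, and `try_merge` are hypothetical names, not part of any storage product API.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class SyncSession:
    volumes: Set[str] = field(default_factory=set)
    steady: bool = False  # assumption: set once the session reaches steady state


def add_volume(volume: str) -> SyncSession:
    """Create an independent second session for the volume being added."""
    return SyncSession(volumes={volume})


def try_merge(first: SyncSession, second: SyncSession) -> bool:
    """Merge `second` into `first` once the second session is steady."""
    if not second.steady:
        return False  # criterion not met: the sessions stay independent
    first.volumes |= second.volumes
    second.volumes = set()
    return True
```

Keeping the new volume in its own session until it is steady means the first session is unaffected while the added volume is still being brought into sync.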

Method for Operating a Redundant Automation System
20220091946 · 2022-03-24

A method for operating a redundant automation system to control a technical process, wherein a second fail-safe subsystem is operated redundantly in relation to a first fail-safe subsystem, and wherein the faulty second fail-safe subsystem is used, where synchronization data is initially buffered in the second subsystem, and in the event that no errors are identified, the first fail-safe subsystem sends an error-free message to the second fail-safe subsystem, which acknowledges the error-free message with an error-free acknowledgment and processes the initially buffered synchronization data.

Decentralized and distributed continuous replication system for moving devices

A replication system for data of mobile devices is disclosed. The data of a mobile device is uploaded to stations in an area. Metadata associated with the objects is stored in a centralized or decentralized system. The metadata can be accessed to identify the stations storing the device's objects and the data of the mobile device can then be retrieved from the stations and reconstructed.

Method and apparatus for performing dynamic recovery management regarding redundant array of independent disks
11301326 · 2022-04-12

A method and apparatus for performing dynamic recovery management regarding a RAID are provided. The method includes: writing a first set of protected data into a first protected access unit of multiple protected access units of the RAID, and recording a first set of management information corresponding to the first set of protected data, for data recovery of the first set of protected data; and when any storage device of multiple storage devices of the RAID malfunctions, writing a second set of protected data into a second protected access unit of the protected access units, and recording a second set of management information corresponding to the second set of protected data, for data recovery of the second set of protected data. Any set of the first set of protected data and the second set of protected data includes data and multiple parity-check codes.

Enabling high availability in server SAN enabled storage box

A computer system has a first node including a first baseboard management controller (BMC) and a first host of the first BMC. The first node determines that the first node is an active node. The first node operates a first storage service at the first host. The first host is a first storage device connected to one or more storage drives. The first storage service manages a first Remote Direct Memory Access (RDMA) controller for accessing user data stored on the one or more storage drives. The first node indicates to a second node that the first node is operating normally. The first node syncs data available on the first node with the second node.