G06F3/0646

Dynamic data placement for replicated raid in a storage system

A method is disclosed for destaging data to a storage device set that is arranged to maintain M replicas of the data, the storage device set having M primary storage devices and N secondary storage devices, the method comprising: detecting a destage event; and in response to the destage event, destaging a data item that is stored in a journal, the destaging including: issuing M primary write requests for storing the data item, each of the M primary write requests being directed to a different one of the M primary storage devices; in response to detecting that L of the primary write requests have failed, issuing L secondary write requests for storing the data item, each of the L secondary write requests being directed to a different secondary storage device; and updating a bitmap to identify all primary and secondary storage devices where the data item has been stored.
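
The destage flow in this abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the `destage` function, the `write` callback, and the bit layout (primaries first, then secondaries) are assumptions made only for illustration.

```python
from typing import Callable, List


def destage(item: bytes,
            primaries: List[str],
            secondaries: List[str],
            write: Callable[[str, bytes], bool]) -> int:
    """Return a bitmap of the devices that hold `item`.

    Bit i refers to primaries[i]; bit len(primaries) + j refers to
    secondaries[j]. `write` is a hypothetical callback that returns
    True when the device accepted the write.
    """
    bitmap = 0
    failed = 0
    # Issue one primary write per primary device (M writes in total).
    for i, dev in enumerate(primaries):
        if write(dev, item):
            bitmap |= 1 << i
        else:
            failed += 1
    # L primary writes failed: issue L secondary writes, each directed
    # to a different secondary device.
    for j, dev in enumerate(secondaries[:failed]):
        if write(dev, item):
            bitmap |= 1 << (len(primaries) + j)
    return bitmap
```

With three primaries, two secondaries, and a failure on the second primary, the sketch sets the bits for the first and third primary and the first secondary.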

Memory-fabric-based processor context switching system

A memory-fabric-based processor context switching system includes server devices coupled to a memory fabric. A first processing system in a first server device receives a request to move a process it is executing and, in response, copies first processing system context values to its first local memory system in the first server device, and generates a first data mover instruction that causes a first data mover device in the first server device to transmit the first processing system context values from the first local memory system to the memory fabric. A second processing system in a second server device generates a second data mover instruction that causes a second data mover device in the second server device to retrieve the first processing system context values from the memory fabric and provide the first processing system context values in a second local memory system included in the second server device.

Distribution from multiple servers to multiple nodes

Embodiments of the present disclosure provide a computer-implemented method, a system, and a computer program product for distributing data on multiple servers to multiple nodes in a cluster. In the method, each of M servers is instructed to divide data thereon into N data segments. M and N are integers greater than one. The M servers are instructed to send the N×M data segments on the M servers to N nodes in a cluster concurrently. For each of the M servers, the N data segments are sent respectively to the N nodes. When any given node in the cluster receives a data piece of a data segment from a server of the M servers, the given node is instructed to transmit the received data piece to the remaining nodes in the cluster.
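
The segment routing described here can be illustrated with a small sequential simulation. This is a sketch under assumptions: `split` and `distribute` are hypothetical names, segments are equal-sized byte slices, and the concurrency of the real transfers is left out.

```python
from typing import Dict, List, Tuple


def split(data: bytes, n: int) -> List[bytes]:
    """Divide `data` into n segments (the last ones may be short or empty)."""
    size = -(-len(data) // n)  # ceiling division
    return [data[k * size:(k + 1) * size] for k in range(n)]


def distribute(server_data: List[bytes],
               n_nodes: int) -> List[Dict[Tuple[int, int], bytes]]:
    """Simulate M servers sending N segments each to N nodes.

    Segment j of server s goes to node j first; node j then relays it
    to every other node. Each node ends up with all N*M segments,
    keyed by (server index, segment index).
    """
    nodes: List[Dict[Tuple[int, int], bytes]] = [dict() for _ in range(n_nodes)]
    for s, data in enumerate(server_data):
        for j, segment in enumerate(split(data, n_nodes)):
            nodes[j][(s, j)] = segment           # server s -> node j
            for k in range(n_nodes):
                if k != j:
                    nodes[k][(s, j)] = segment   # node j -> remaining nodes
    return nodes
```

Each server uploads its data only once in total, while the relaying between nodes completes the full copy on every node.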

DATA STRUCTURE OF A PROTECTION STORE FOR STORING SNAPSHOTS
20210117274 · 2021-04-22 ·

A computer-implemented method comprising: maintaining a metadata object for a snapshot that is being loaded into a particular protection store in a destination data storage system, wherein the metadata object comprises a sequence of protection-store-specific hash values, wherein each protection-store-specific hash value of the sequence of protection-store-specific hash values corresponds to a respective data object that belongs to the snapshot; and, when all data objects that belong to the snapshot are in the particular protection store, storing the metadata object for the snapshot in the particular protection store.

METHOD, DEVICE, AND COMPUTER PROGRAM PRODUCT FOR STORAGE MANAGEMENT
20210117088 · 2021-04-22 ·

Techniques involve: determining a first group of storage disks, a use rate of each storage disk of the first group of storage disks exceeding a first threshold, the first group of storage disks comprising a first group of storage blocks corresponding to a first redundant array of independent disks (RAID); allocating a second group of storage blocks corresponding to a second RAID from a second group of storage disks, the second group of storage blocks having the same size as that of the first group of storage blocks, a use rate of each storage disk of the second group of storage disks being under a second threshold; moving data in the first group of storage blocks to the second group of storage blocks; and releasing the first group of storage blocks from the first group of storage disks. Thus, use rates of the storage disks become more balanced.
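
The rebalancing steps can be sketched as follows. The sketch makes several assumptions that the abstract does not state: the `rebalance` function and its parameters are hypothetical, each over-used disk contributes exactly one block of `block_size`, and hot and cold disks are paired one to one.

```python
from typing import Dict, List, Tuple


def rebalance(used: Dict[str, int],
              capacity: Dict[str, int],
              block_size: int,
              high: float,
              low: float) -> List[Tuple[str, str]]:
    """Move one block from each over-used disk to a distinct under-used disk.

    `used` is updated in place. Returns the (source, destination) pairs,
    or an empty list when there are not enough under-used disks to host
    the whole group of blocks.
    """
    hot = [d for d in used if used[d] / capacity[d] > high]
    cold = [d for d in used if used[d] / capacity[d] < low]
    if not hot or len(cold) < len(hot):
        return []
    moves = []
    for src, dst in zip(hot, cold):
        used[dst] += block_size  # allocate the new block and copy the data
        used[src] -= block_size  # release the old block
        moves.append((src, dst))
    return moves
```

Pairing each source with a different destination keeps the relocated blocks on distinct disks, which a RAID stripe normally requires.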

METHOD AND SYSTEM FOR EFFICIENT CONTENT TRANSFER TO DISTRIBUTED STORES
20210096759 · 2021-04-01 ·

A method is performed by a computer system. The method includes receiving a source object; providing an upload queue; executing a chunk creation loop to create a first plurality of chunks and add the first plurality of chunks to the upload queue, each chunk in the first plurality of chunks containing a respective portion of the source object; implementing an upload thread pool of upload threads; and executing the upload threads to upload the source object to a distributed data store, including executing the upload threads to upload chunks from the first plurality of chunks to the distributed data store in parallel.
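
A minimal sketch of the queue-and-thread-pool pattern in this abstract, using only the Python standard library. The `upload_object` function and the `put_chunk` callback are hypothetical stand-ins for the real upload call, and the sketch assumes the chunk creation loop finishes before the upload threads start.

```python
import queue
import threading
from typing import Callable


def upload_object(source: bytes,
                  chunk_size: int,
                  put_chunk: Callable[[int, bytes], None],
                  workers: int = 4) -> int:
    """Split `source` into chunks and upload them in parallel.

    `put_chunk(index, data)` must be safe to call from several threads.
    Returns the number of chunks created.
    """
    upload_queue: "queue.Queue" = queue.Queue()

    # Chunk creation loop: each chunk holds a portion of the source object.
    count = 0
    for offset in range(0, len(source), chunk_size):
        upload_queue.put((count, source[offset:offset + chunk_size]))
        count += 1

    def upload_worker() -> None:
        # All chunks are queued before the workers start, so an empty
        # queue means there is nothing left to upload.
        while True:
            try:
                index, chunk = upload_queue.get_nowait()
            except queue.Empty:
                return
            put_chunk(index, chunk)

    # Upload thread pool.
    threads = [threading.Thread(target=upload_worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return count
```

Carrying the chunk index alongside the data lets the receiving side reassemble the object even though chunks arrive out of order.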

DYNAMIC DATA PLACEMENT FOR REPLICATED RAID IN A STORAGE SYSTEM
20210133030 · 2021-05-06 ·

A method is disclosed for destaging data to a storage device set that is arranged to maintain M replicas of the data, the storage device set having M primary storage devices and N secondary storage devices, the method comprising: detecting a destage event; and in response to the destage event, destaging a data item that is stored in a journal, the destaging including: issuing M primary write requests for storing the data item, each of the M primary write requests being directed to a different one of the M primary storage devices; in response to detecting that L of the primary write requests have failed, issuing L secondary write requests for storing the data item, each of the L secondary write requests being directed to a different secondary storage device; and updating a bitmap to identify all primary and secondary storage devices where the data item has been stored.

TECHNOLOGIES FOR STORAGE AND PROCESSING FOR DISTRIBUTED FILE SYSTEMS

Techniques for storage and processing for distributed file systems are disclosed. In the illustrative embodiment, padding is placed between data elements in a file to be stored on a distributed file system. The file is to be split into several objects in order to be stored in the distributed file system, and the padding is used to prevent a data element from being split across two different objects. The objects are stored on data nodes, which analyze the objects to determine which data elements are present in the object as well as the location of those objects. The location of the objects is saved on the data storage device, and those locations can be used to perform queries on the data elements in the object on the data storage device itself. Such an approach can reduce transfer of data elements from data storage to local memory of the data node.
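
The padding idea can be shown with a short sketch. The `pad_elements` function, the fixed `object_size`, and the zero padding byte are assumptions for illustration; a real format would also need a way to tell padding from data.

```python
from typing import Iterable


def pad_elements(elements: Iterable[bytes],
                 object_size: int,
                 pad: bytes = b"\x00") -> bytes:
    """Concatenate elements, padding so that none straddles an object boundary.

    The file is assumed to be cut into objects of exactly `object_size`
    bytes. Elements larger than one object cannot be protected this way.
    """
    out = bytearray()
    for element in elements:
        if len(element) > object_size:
            raise ValueError("element is larger than one object")
        # Bytes remaining in the object currently being filled.
        room = object_size - len(out) % object_size
        if len(element) > room:
            out += pad * room  # fill the rest of this object with padding
        out += element
    return bytes(out)
```

With 5-byte objects, a 4-byte element that follows a 3-byte element is pushed to the start of the next object, so each data node sees only whole elements.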

SAMPLING FINGERPRINTS IN BINS
20210132838 · 2021-05-06 ·

In some examples, a system associates a plurality of buffers in a memory with respective multiple bins of a fingerprint index in persistent storage. The system computes fingerprints for incoming data units, and selects, based on an adaptive sampling indication, a subset of the fingerprints. The system adds fingerprint index entries corresponding to the selected subset of the fingerprints to a respective subset of the multiple bins, wherein adding a fingerprint index entry to a bin of the respective subset of the multiple bins comprises adding the fingerprint index entry to the buffer of the bin.
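
One way to picture the sampling step is the sketch below. It assumes, without support from the abstract, that the adaptive sampling indication is a number of low-order fingerprint bits that must be zero; `index_fingerprints`, `sample_bits`, and the use of SHA-256 are hypothetical choices.

```python
import hashlib
from collections import defaultdict
from typing import DefaultDict, Iterable, List, Tuple


def index_fingerprints(data_units: Iterable[bytes],
                       num_bins: int,
                       sample_bits: int
                       ) -> DefaultDict[int, List[Tuple[bytes, int]]]:
    """Buffer sampled fingerprint entries per bin.

    A fingerprint is selected when its low `sample_bits` bits are zero,
    so sample_bits == 0 selects every fingerprint. Each selected entry
    is appended to the in-memory buffer of the bin it maps to.
    """
    buffers: DefaultDict[int, List[Tuple[bytes, int]]] = defaultdict(list)
    mask = (1 << sample_bits) - 1
    for location, unit in enumerate(data_units):
        fingerprint = hashlib.sha256(unit).digest()
        value = int.from_bytes(fingerprint[:8], "big")
        if value & mask:
            continue  # not part of the sampled subset
        bin_id = (value >> sample_bits) % num_bins
        buffers[bin_id].append((fingerprint, location))
    return buffers
```

Raising `sample_bits` thins the index roughly by half per bit, which is one simple way a sampling rate could adapt to load.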

MULTIPART UPLOADING TO OBJECT-BASED STORAGE

A system may include a memory and a processor in communication with the memory configured to perform operations. The operations may include obtaining transaction logs in blocks from nodes of a data storage system. The operations may include, for each transaction log, splitting the transaction log into log entries, grouping log entries into groups associated with a same data source, and writing the log entries of the groups to empty blocks such that log entries from different groups do not share a same block. The operations may include identifying a same sequence of log entries from the written transaction logs and uploading first blocks of a first transaction log, including the same sequence of log entries, to an object-based storage without uploading second blocks of a second transaction log including the same sequence of log entries to the object-based storage.