G06F3/067

Technologies for providing shared memory for accelerator sleds

Technologies for providing shared memory for accelerator sleds include an accelerator sled to receive, with a memory controller, a memory access request from an accelerator device to access a region of memory. The request is to identify the region of memory with a logical address. Additionally, the accelerator sled is to determine, from a map of logical addresses and associated physical addresses, the physical address associated with the region of memory. In addition, the accelerator sled is to route the memory access request to a memory device associated with the determined physical address.
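The lookup-and-route step in this abstract can be illustrated with a minimal sketch. It assumes the map is a sorted table of contiguous logical ranges; the class name `MemoryController`, its methods, and the device identifiers are hypothetical and do not come from the patent.

```python
from bisect import bisect_right


class MemoryController:
    """Hypothetical controller that maps logical ranges to memory devices."""

    def __init__(self):
        self._starts = []   # sorted logical start addresses
        self._entries = []  # (logical_start, length, device_id, physical_base)

    def map_region(self, logical_start, length, device_id, physical_base):
        # Keep both lists sorted by logical start so lookups can use bisect.
        idx = bisect_right(self._starts, logical_start)
        self._starts.insert(idx, logical_start)
        self._entries.insert(idx, (logical_start, length, device_id, physical_base))

    def route(self, logical_address):
        """Return (device_id, physical_address) for a logical address."""
        idx = bisect_right(self._starts, logical_address) - 1
        if idx < 0:
            raise KeyError(f"unmapped logical address {logical_address:#x}")
        start, length, device_id, base = self._entries[idx]
        if logical_address >= start + length:
            raise KeyError(f"unmapped logical address {logical_address:#x}")
        return device_id, base + (logical_address - start)
```

A request for a logical address then resolves to the device that should service it, for example `ctrl.route(0x1004)` after `ctrl.map_region(0x1000, 0x1000, "dimm0", 0x8000)`.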

SYSTEMS, METHODS, AND APPARATUS FOR REMOTE DATA TRANSFERS TO MEMORY
20230044165 · 2023-02-09

A method may include receiving, at a target, from a server, a command, information to identify data, and access information to perform a data transfer using a memory access protocol, and performing, based on the command and the access information, the data transfer between the target and a client using the memory access protocol. The information to identify the data may include an object key, and the object key and the access information may be encoded, at least partially, in an encoded object key. The method may further include sending, based on the data transfer, from the target to the server, a completion. The method may further include sending, based on the completion, from the server to the client, an indication of success. The method may further include reconstructing the data based on parity data.

CONSTANT TIME UPDATES AFTER MEMORY DEDUPLICATION
20230040039 · 2023-02-09

Systems and methods are described for resource-efficient memory deduplication and write-protection. In an example, a method includes receiving, by a computing device having a processor, a request to assess deduplication for a plurality of candidate files. The computing device may perform one or more iterative steps for deduplication. The iterative steps may include: receiving, from the plurality of candidate files, a candidate file that is not write-protected; determining, based on a predetermined Bernoulli distribution, a decision to write-protect the candidate file; rendering the candidate file as a write-protected candidate file; determining, based on a review of other candidate files from the plurality of candidate files, that the write-protected candidate file can be deduplicated; and deduplicating the write-protected candidate file.
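The iterative steps above can be sketched as follows. This is an illustration under stated assumptions, not the claimed method: files are modeled as an in-memory dict of name to bytes, "deduplication" is recorded as an alias to an earlier write-protected file with the same SHA-256 digest, and the function name `deduplicate` and parameter `p` are hypothetical.

```python
import hashlib
import random


def deduplicate(files, p=0.5, rng=None):
    """files: dict of name -> bytes. Returns (protected names, alias map)."""
    rng = rng or random.Random()
    protected = set()
    canonical = {}  # content digest -> first write-protected file with it
    alias = {}      # deduplicated file -> canonical file it now points to
    for name, data in files.items():
        # Bernoulli(p) trial: write-protect this candidate with probability p.
        if rng.random() >= p:
            continue
        protected.add(name)
        # Assumption: only write-protected files are compared, since their
        # contents cannot change while the comparison is in progress.
        digest = hashlib.sha256(data).hexdigest()
        if digest in canonical:
            alias[name] = canonical[digest]
        else:
            canonical[digest] = name
    return protected, alias
```

Passing a seeded `random.Random` as `rng` makes a run reproducible, which is convenient when experimenting with different values of `p`.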

PARALLEL READS OF DATA STAGING TABLE

Systems and methods to read records of a data staging table, where each record of the data staging table is associated with a package identifier, a key value of a record of a first database table, values of one or more non-key fields of the record of the first database table, and a database operation, include reading one or more records of the data staging table, each of the read one or more records associated with a package identifier indicating the record is not being processed, and not including a same key value as any other record of the data staging table associated with a package identifier indicating the record is being processed, updating the package identifier of each of the read records of the data staging table to a first package identifier indicating that the record is being processed, creating a transaction record of a transaction queue associating the data staging table and the first package identifier, determining that the read one or more records have been processed, and, in response to the determination, deleting the one or more read records from the data staging table and the transaction record.
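A minimal in-memory sketch of the claim-and-complete cycle follows. It assumes the staging table and transaction queue are plain Python lists of dicts and that `None` marks a record that is not being processed; the function names and field names are hypothetical, and a real system would perform these steps inside database transactions.

```python
NOT_PROCESSING = None  # assumption: None means "not being processed"


def claim_records(staging, queue, package_id, table_name="staging"):
    """Claim every free record whose key is not held by an in-flight package."""
    busy_keys = {r["key"] for r in staging
                 if r["package_id"] is not NOT_PROCESSING}
    claimed = [r for r in staging
               if r["package_id"] is NOT_PROCESSING
               and r["key"] not in busy_keys]
    for record in claimed:
        record["package_id"] = package_id  # mark as being processed
    if claimed:
        # The transaction record ties the staging table to this package.
        queue.append({"table": table_name, "package_id": package_id})
    return claimed


def complete_package(staging, queue, package_id):
    """After processing, delete the claimed records and the transaction record."""
    staging[:] = [r for r in staging if r["package_id"] != package_id]
    queue[:] = [t for t in queue if t["package_id"] != package_id]
```

Skipping keys held by an in-flight package is what lets several readers work on the table in parallel without applying operations on the same key out of order.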

SYSTEM AND METHOD FOR DATA COMPACTION UTILIZING MISMATCH PROBABILITY ESTIMATION

A system and method for compacting data that uses mismatch probability estimation to improve entropy encoding methods to account for, and efficiently handle, previously-unseen data in data to be compacted. Training data sets are analyzed to determine the frequency of occurrence of each sourceblock in the training data sets. A mismatch probability estimate is calculated comprising an estimated frequency at which any given data sourceblock received during encoding will not have a codeword in the codebook. Entropy encoding is used to generate codebooks comprising codewords for data sourceblocks based on the frequency of occurrence of each sourceblock. A “mismatch codeword” is inserted into the codebook based on the mismatch probability estimate to represent those cases when a block of data to be encoded does not have a codeword in the codebook. During encoding, if a mismatch occurs, a secondary encoding process is used to encode the mismatched sourceblock.
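The idea of reserving a codeword for unseen sourceblocks can be sketched with a small Huffman coder. Several choices here are assumptions made for illustration: Huffman coding stands in for the entropy encoder, the mismatch frequency is estimated as the number of sourceblocks seen exactly once in training (a Good-Turing style heuristic), and the secondary encoding simply writes the raw bytes of the block. All names are hypothetical.

```python
import heapq
from collections import Counter
from itertools import count

MISMATCH = object()  # sentinel symbol for "block not in codebook"


def build_codebook(training_blocks):
    """Huffman codebook over training blocks plus a mismatch codeword."""
    freq = Counter(training_blocks)
    # Assumed estimator: blocks seen once hint at how often unseen ones occur.
    singletons = sum(1 for c in freq.values() if c == 1)
    freq[MISMATCH] = max(1, singletons)
    tie = count()  # unique tie-breaker so the dicts are never compared
    heap = [(f, next(tie), {sym: ""}) for sym, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


def encode(blocks, codebook):
    """Encode blocks as a bit string; unseen blocks use the mismatch path."""
    out = []
    for block in blocks:
        if block in codebook:
            out.append(codebook[block])
        else:
            # Secondary encoding (assumed): mismatch codeword + raw bits.
            raw = "".join(f"{byte:08b}" for byte in block)
            out.append(codebook[MISMATCH] + raw)
    return "".join(out)


def decode(bits, codebook, block_size):
    """Invert encode(); block_size is the fixed sourceblock length in bytes."""
    inverse = {code: sym for sym, code in codebook.items()}
    out, current, i = [], "", 0
    while i < len(bits):
        current += bits[i]
        i += 1
        if current in inverse:
            sym = inverse[current]
            if sym is MISMATCH:
                raw = bits[i:i + 8 * block_size]
                out.append(bytes(int(raw[j:j + 8], 2)
                                 for j in range(0, len(raw), 8)))
                i += 8 * block_size
            else:
                out.append(sym)
            current = ""
    return out
```

Because the mismatch codeword is part of the same prefix-free code, the decoder can tell unambiguously when to switch to reading a raw block. The sketch assumes non-empty training data and fixed-length sourceblocks.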

DATA LINEAGE IN A DATA PIPELINE
20230041906 · 2023-02-09

Various embodiments comprise systems and methods to monitor operations of a data pipeline. In some examples, a data pipeline receives data inputs, processes the data inputs, and responsively generates and transfers data outputs. Data monitoring circuitry monitors the operations of the data pipeline circuitry, identifies an input change between an initial one of the data inputs and a subsequent one of the data inputs, and identifies an output change between an initial one of the data outputs and a subsequent one of the data outputs. The data monitoring circuitry correlates the input change to the output change, determines a quality threshold for the output change based on the correlation, and determines when the output change falls below the quality threshold. When the output change falls below the quality threshold, the data monitoring circuitry generates and transfers an alert that indicates the input change and the output change.

Secure and transparent pruning for blockchains
11556247 · 2023-01-17

A method for enabling pruning of a blockchain of a blockchain network includes creating an active blocks commitments Merkle tree from hashes of active blocks and creating an active smart contracts commitments Merkle tree from hashes of active smart contracts. The Merkle trees are created after an amount of blocks created in the blockchain has reached a threshold set by a pruning threshold parameter stored in the blockchain network. Hashes of the roots of the Merkle trees are stored in a header of a new block as a new genesis block. The new genesis block is broadcast to the blockchain network. A set of the active blocks and active smart contracts used respectively to create the active blocks commitments Merkle tree and the active smart contracts commitments Merkle tree are committed to upon the blockchain network reaching consensus on the new genesis block.
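A minimal sketch of the commitment step follows, assuming SHA-256 as the hash function and the common convention of duplicating the last node on odd-sized levels; neither detail is stated in the abstract. The function names and header field names are hypothetical, and consensus and broadcast are out of scope.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaf_hashes):
    """Fold a list of leaf hashes pairwise into a single root hash."""
    if not leaf_hashes:
        return sha256(b"")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumed: duplicate last node if odd
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def make_pruning_genesis(active_blocks, active_contracts,
                         chain_length, pruning_threshold):
    """Return a new genesis header once the chain reaches the threshold."""
    if chain_length < pruning_threshold:
        return None
    blocks_root = merkle_root([sha256(b) for b in active_blocks])
    contracts_root = merkle_root([sha256(c) for c in active_contracts])
    # The header stores hashes of the two Merkle roots.
    return {
        "blocks_commitment": sha256(blocks_root).hex(),
        "contracts_commitment": sha256(contracts_root).hex(),
    }
```

Any node holding the same active blocks and contracts can recompute both commitments and check them against the proposed genesis header before agreeing to prune.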

Multiple data labels within a backup system

Embodiments are directed to a method of performing data migration, such as backups and restores, in a network by: identifying characteristics of data in a data saveset to separate the data into defined types based on respective characteristics; assigning a data label to each defined type by receiving a user selection or by automatically merging or selecting a priority label from among many labels associated with a file; defining migration rules for each data label; discovering assigned labels during a migration operation; and applying respective migration rules to labeled data in the data saveset. The migration rules can dictate storage location, access rights, replication periods, retention periods, and similar parameters.
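The label-selection and rule-lookup steps can be sketched as below. The label names, the priority order, and the rule values are invented for illustration, and only the "select a priority label" branch is shown; merging labels is not modeled.

```python
# Hypothetical labels, highest priority first.
LABEL_PRIORITY = ["confidential", "financial", "general"]

# Hypothetical migration rules per label.
MIGRATION_RULES = {
    "confidential": {"location": "encrypted-tier", "retention_days": 2555},
    "financial": {"location": "compliance-tier", "retention_days": 1825},
    "general": {"location": "standard-tier", "retention_days": 365},
}


def pick_label(labels, user_choice=None):
    """Use the user's selection if given, else the highest-priority label."""
    if user_choice is not None:
        return user_choice
    for label in LABEL_PRIORITY:
        if label in labels:
            return label
    return "general"  # assumed default for unlabeled data


def plan_migration(saveset):
    """saveset: dict of file name -> set of labels. Returns rules per file."""
    return {name: MIGRATION_RULES[pick_label(labels)]
            for name, labels in saveset.items()}
```

With this shape, a file carrying both "financial" and "confidential" labels is migrated under the "confidential" rules.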

Data transformation for a machine learning model

Data transformation caching in an artificial intelligence infrastructure that includes one or more storage systems and one or more graphical processing unit (‘GPU’) servers, including: identifying, in dependence upon one or more machine learning models to be executed on the GPU servers, one or more transformations to apply to a dataset; generating, in dependence upon the one or more transformations, a transformed dataset; storing, within one or more of the storage systems, the transformed dataset; receiving a plurality of requests to transmit the transformed dataset to one or more of the GPU servers; and responsive to each request, transmitting, from the one or more storage systems to the one or more GPU servers without re-performing the one or more transformations on the dataset, the transformed dataset.
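The caching behavior described here, transforming once and serving the stored result to every later request, can be sketched with a small in-memory class. `TransformCache` and its method names are hypothetical, and the dict stands in for the storage systems.

```python
class TransformCache:
    """Hypothetical store that keeps transformed datasets for reuse."""

    def __init__(self):
        self._store = {}         # (dataset_id, transform_name) -> result
        self.transform_runs = 0  # how many times a transform actually ran

    def get(self, dataset_id, transform_name, dataset, transform):
        """Return the transformed dataset, computing it only on first use."""
        key = (dataset_id, transform_name)
        if key not in self._store:
            self._store[key] = transform(dataset)
            self.transform_runs += 1
        return self._store[key]
```

Repeated requests from GPU servers for the same dataset and transformation then hit the stored copy, so `transform_runs` stays at one no matter how many requests arrive.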

Fragmentation measurement solution
11556256 · 2023-01-17

A degree of fragmentation is determined based on a number of holes present in a storage system layout or a portion of a layout. Edges between the holes and used portions of the storage system are tabulated by scanning a storage space. The occurrences of a pattern of used/available allocation units and/or the occurrences of another pattern of available/used allocation units are recognized. A fragmentation value is calculated based on occurrences of the patterns in view of the total storage space. The present fragmentation measurement system utilizes the number of occurrences of the holes in assessing fragmentation.
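The edge-counting scan can be sketched as follows. The abstract does not give a formula, so the normalization here, edges divided by the number of adjacent unit pairs, is an assumption chosen to keep the value between 0 and 1; the function name and the bitmap representation are also hypothetical.

```python
def fragmentation(bitmap):
    """bitmap: sequence of truthy (used) / falsy (available) allocation units.

    Returns the fraction of adjacent unit pairs that form a used/available
    or available/used edge.
    """
    if len(bitmap) < 2:
        return 0.0
    pairs = list(zip(bitmap, bitmap[1:]))
    used_to_free = sum(1 for a, b in pairs if a and not b)
    free_to_used = sum(1 for a, b in pairs if not a and b)
    return (used_to_free + free_to_used) / len(pairs)
```

A fully used or fully free layout scores 0.0, while a strictly alternating layout scores 1.0, which matches the intuition that more holes mean more fragmentation.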