G06F16/1748

Deduplication of encrypted data using multiple keys

Deduplication of encrypted data using multiple keys includes responding to a request to store a predetermined set of data in an electronic data store by receiving a hash corresponding to the predetermined set of data, receiving encrypted data generated by encrypting the predetermined set of data using an encryption key, and receiving a key index corresponding to the encryption key. The hash may be determined to match a previously stored hash, the previously stored hash indicating that a previously encrypted version of the predetermined set of data is stored at a physical location in the electronic data store. Based on determining that the hash matches a previously stored hash, the hash, encrypted data, and key index are discarded.
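
The store-side check described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `DedupStore` class, its `put` method, and the XOR-based `toy_encrypt` placeholder are all hypothetical names introduced here, and the hash is assumed to be a SHA-256 digest of the plaintext computed by the client.

```python
import hashlib


class DedupStore:
    """Toy store keyed by plaintext hash (hypothetical, for illustration)."""

    def __init__(self):
        # hash -> (encrypted bytes, key index) for the first copy stored
        self.blocks = {}

    def put(self, data_hash, encrypted, key_index):
        """Return True if stored, False if discarded as a duplicate."""
        if data_hash in self.blocks:
            # A previously encrypted version is already stored, so the new
            # hash, encrypted data, and key index are simply discarded.
            return False
        self.blocks[data_hash] = (encrypted, key_index)
        return True


def toy_encrypt(data, key):
    # Placeholder XOR "cipher" so the example runs; it is NOT secure.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


store = DedupStore()
data = b"same payload"
h = hashlib.sha256(data).hexdigest()

# Two clients encrypt the same data under different keys (indices 0 and 1).
first = store.put(h, toy_encrypt(data, b"key-A"), key_index=0)
second = store.put(h, toy_encrypt(data, b"key-B"), key_index=1)
```

In this sketch the second request is dropped even though its ciphertext differs from the stored one, because the match is made on the hash rather than on the encrypted bytes.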

Deduplication of encrypted data

A data storage system configured to deduplicate and store sets of data is presented. The system comprises a computer readable storage device configured to store a plurality of sets of data for a plurality of hosts, wherein each set of data of the plurality of sets of data corresponding to each host of the plurality of hosts is encrypted with one or more different encryption keys, and wherein at least one of the plurality of sets of data contains deduplicated data. The system also comprises a key translator configured to create at least one translation key based, at least in part, on the one or more different encryption keys and the deduplicated data, wherein the at least one translation key is configured to translate from a first encryption key to a second encryption key of the one or more different encryption keys.
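
The idea of a translation key can be illustrated with a toy XOR pad. This is only a sketch of the general concept: the abstract does not specify the cipher or how the translation key is derived, and here it is built from the two keys alone. The names `xor_bytes`, `key_host1`, and `key_host2` are hypothetical, and a one-time XOR pad is assumed purely because its algebra is easy to follow.

```python
import secrets


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


block = b"deduplicated block"
key_host1 = secrets.token_bytes(len(block))  # hypothetical key for host 1
key_host2 = secrets.token_bytes(len(block))  # hypothetical key for host 2

# The single stored copy is encrypted under host 1's key.
stored = xor_bytes(block, key_host1)

# Toy translation key: maps ciphertext under key 1 to ciphertext under key 2
# without exposing the plaintext in between.
translation_key = xor_bytes(key_host1, key_host2)
for_host2 = xor_bytes(stored, translation_key)

# Host 2 decrypts with its own key.
recovered = xor_bytes(for_host2, key_host2)
```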

Similarity deduplication
11615063 · 2023-03-28 ·

Dictionary-based compression is performed to compress data units using a similar data unit as the base unit (i.e., dictionary) for each candidate data unit. Similarity may be determined between data units by applying a locality-sensitive hashing scheme to each candidate data unit to produce a hash value, and by determining whether there is a matching value in a hash index of hash values for existing data units on the system. If there is a matching hash value, the candidate data unit may be compressed using the data unit corresponding to the matching hash value as the dictionary. Only a representative portion of the data unit may be hashed to produce the hash value, the portion comprised of chunks of the data unit, where each chunk is a continuous, uninterrupted section of data. The chunks themselves may not be (in some embodiments likely are not) contiguous to one another.
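
A rough sketch of this flow is shown below, using Python's `zlib` preset-dictionary support. Several things are assumptions made for illustration: the chunk size and sample offsets are arbitrary, and the "similarity hash" is simply SHA-1 over the sampled chunks, which stands in for a real locality-sensitive hashing scheme (it matches only when the sampled chunks are identical). The function names are hypothetical.

```python
import hashlib
import zlib

CHUNK = 16                    # bytes per sampled chunk (assumed)
OFFSETS = (0, 256, 512, 768)  # non-contiguous sample positions (assumed)


def similarity_hash(unit):
    # Hash only a representative portion: a few contiguous chunks that
    # are not adjacent to one another.
    sample = b"".join(unit[o:o + CHUNK] for o in OFFSETS)
    return hashlib.sha1(sample).hexdigest()


def compress_unit(unit, index):
    """Return (hash key, used_base flag, compressed payload)."""
    key = similarity_hash(unit)
    base = index.get(key)
    if base is None:
        index[key] = unit  # no similar unit yet: becomes a future base
        return key, False, zlib.compress(unit)
    comp = zlib.compressobj(zdict=base)  # similar unit used as dictionary
    return key, True, comp.compress(unit) + comp.flush()


def decompress_unit(key, used_base, payload, index):
    if not used_base:
        return zlib.decompress(payload)
    decomp = zlib.decompressobj(zdict=index[key])
    return decomp.decompress(payload) + decomp.flush()


# Build a 1024-byte unit of poorly compressible data, then a near copy
# that differs only outside the sampled chunks.
base_unit = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(32))
candidate = bytearray(base_unit)
candidate[100:110] = b"0123456789"
candidate = bytes(candidate)

index = {}
compress_unit(base_unit, index)
key, used_base, payload = compress_unit(candidate, index)
restored = decompress_unit(key, used_base, payload, index)
```

Because the candidate shares almost all of its bytes with the base unit, the dictionary-based payload comes out much smaller than compressing the candidate on its own.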

SYSTEM, METHOD AND ARTICLE OF MANUFACTURE FOR SYNCHRONIZATION-FREE TRANSMITTAL OF NEURON VALUES IN A HARDWARE ARTIFICIAL NEURAL NETWORKS
20230086636 · 2023-03-23 ·

Computations in artificial neural networks (ANNs) are accomplished using simple processing units, called neurons, with data embodied by the connections between neurons, called synapses, and by the strength of these connections, the synaptic weights. Crossbar arrays may be used to represent one layer of the ANN with Non-Volatile Memory (NVM) elements at each crosspoint, where the conductance of the NVM elements may be used to encode the synaptic weights, and a highly parallel current summation on the array achieves a weighted sum operation that is representative of the values of the output neurons. A method is outlined to transfer such neuron values from the outputs of one array to the inputs of a second array with no need for global clock synchronization, irrespective of the distances between the arrays, and to use such values at the next array, and/or to convert such values into digital bits at the next array.
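
The weighted-sum operation performed by a crossbar can be sketched numerically as below. This models only the current summation, not the synchronization-free transfer between arrays, whose mechanism the abstract does not detail. The function name and the sample voltages and conductances are hypothetical, and the devices are assumed to be ideal (each column current is the sum of voltage times conductance down that column).

```python
def crossbar_outputs(voltages, conductances):
    """Column currents of an idealized crossbar.

    voltages[i] is the input applied to row i; conductances[i][j] is the
    conductance of the element at row i, column j (the synaptic weight).
    """
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


# Three input neurons (rows) feeding two output neurons (columns).
voltages = [1.0, 0.5, 0.0]
conductances = [
    [2.0, 1.0],
    [4.0, 0.0],
    [1.0, 3.0],
]
currents = crossbar_outputs(voltages, conductances)
# currents == [4.0, 1.0]
```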

ELASTIC, EPHEMERAL IN-LINE DEDUPLICATION SERVICE

A deduplication service can be provided to a storage domain from a services framework that expands and contracts to both meet service demand and to conform to resource management of a compute domain. The deduplication service maintains a fingerprint database and reference count data in compute domain resources, but persists these into the storage domain for use in the case of a failure or interruption of the deduplication service in the compute domain. The deduplication service responds to service requests from the storage domain with indications of paths in a user namespace and whether or not a piece of data had a fingerprint match in the fingerprint database. The indication of a match guides the storage domain to either store the piece of data into the storage backend or to reference another piece of data. The deduplication service uses the fingerprints to define paths for corresponding pieces of data.
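
The request handling described here might look roughly like the following sketch. It assumes SHA-256 fingerprints and an invented `/dedup/<prefix>/<fingerprint>` path layout; the `DedupService` class and `handle` method are hypothetical, and persisting the fingerprint database and reference counts into the storage domain is omitted.

```python
import hashlib


class DedupService:
    """Toy in-memory fingerprint database with reference counts."""

    def __init__(self):
        self.refcounts = {}  # fingerprint -> number of references

    def handle(self, piece):
        """Return (path, matched) for a piece of data."""
        fp = hashlib.sha256(piece).hexdigest()
        matched = fp in self.refcounts
        self.refcounts[fp] = self.refcounts.get(fp, 0) + 1
        # The fingerprint itself defines the path (layout is an assumption).
        path = f"/dedup/{fp[:2]}/{fp}"
        return path, matched


service = DedupService()
path1, matched1 = service.handle(b"piece of data")
path2, matched2 = service.handle(b"piece of data")
```

On the first response the storage side would write the piece at `path1`; on the second, the match flag tells it to reference the existing piece at the same path instead of storing it again.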

SIMILARITY DATA FOR REDUCED DATA USAGE
20230088163 · 2023-03-23 ·

In one implementation, a method includes identifying a first content-dependent feature associated with a data sector. The method further includes determining a baseline data sector associated with the data sector. The method further includes determining, by a processing device, a content-dependent delta between the first content-dependent feature and a second content-dependent feature of the baseline data sector. The method further includes providing the content-dependent delta and an indicator to the baseline data sector for storage on a plurality of storage devices.

METHOD AND APPARATUS FOR REPLICATING A TARGET FILE BETWEEN DEVICES

There is provided a method and apparatus for remote differential compression (RDC) and data deduplication. According to embodiments, when a sending device acquires a new target file, the following steps are performed. Initially, Jaccard segmentation is performed, followed by performing identity-based segment deduplication and similarity-based segment deduplication. The transmission of the target file in the deduplicated form to the recipient device is subsequently performed. The recipient device can then rebuild the original target file from the deduplicated form thus replicating the target file at the recipient device with the target file originally present at the sending device.

A DATA EXTRACTION METHOD

Described herein is a method (100) of extracting data from a dataset of files stored in a database (109). The method (100) includes a step (101) of executing a conversion procedure to convert the dataset of files into a plurality (N) of structured binary files. At step (102) the structured binary files are stored in memory. At step (103) a query is received from a user input to extract queried data from the dataset. The query includes a plurality of query arguments. At step (104), the query arguments are input to a data query procedure. The query procedure includes the substeps of: (104a) accessing the structured binary files in memory; (104b) loading a reference data structure into memory, the reference data structure specifying a list of data classes; (104c) executing a data query algorithm to retrieve a subset of the data determined by the query arguments; and (104d) returning the subset of the data as one or more files having a predetermined file type.

MEMORY OPTIMIZED ALGORITHM FOR EVALUATING DEDUPLICATION HASHES FOR LARGE DATA SETS
20230090289 · 2023-03-23 ·

One example method includes performing a hash of data to generate a hash value; checking a binary trie to determine whether the hash value has previously been entered into the binary trie; if the hash value has previously been entered in the binary trie, declaring the data a duplicate of other data; and, if the hash value has not been previously entered in the binary trie, updating the binary trie to include the hash value.
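
The check-then-insert step can be sketched with a simple binary trie over the bits of the hash. This is an illustrative sketch only: it uses nested Python dictionaries rather than any memory-optimized layout, assumes fixed-length SHA-256 hashes, and the names `BinaryTrie`, `add_if_absent`, and `is_duplicate` are hypothetical.

```python
import hashlib


class BinaryTrie:
    """Binary trie keyed on the bits of fixed-length hash values."""

    def __init__(self):
        self.root = {}

    def add_if_absent(self, hash_bytes):
        """Walk the hash bit by bit, creating nodes as needed.

        Returns True if the full path already existed (the hash was
        entered before), False if any node had to be created.
        """
        node = self.root
        created = False
        for byte in hash_bytes:
            for shift in range(7, -1, -1):  # most significant bit first
                bit = (byte >> shift) & 1
                if bit not in node:
                    node[bit] = {}
                    created = True
                node = node[bit]
        return not created


def is_duplicate(data, trie):
    """Hash the data, then check and update the trie in one pass."""
    return trie.add_if_absent(hashlib.sha256(data).digest())


trie = BinaryTrie()
first = is_duplicate(b"block A", trie)   # not seen before
second = is_duplicate(b"block A", trie)  # same data again
third = is_duplicate(b"block B", trie)   # different data
```

Because every hash has the same length, a complete existing path can only mean that the identical hash value was inserted earlier, so a single traversal serves as both the lookup and the update.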

Server for ingesting and rendering data objects

Systems and methods are provided to ingest and integrate data objects for use in one or more system operations including providing a renderable data object to a user and updating a data item database.