G06F16/174

Encoding / Decoding System and Method
20230214353 · 2023-07-06

A computer-implemented method, computer program product and computing system for: processing an unencoded data file to identify a plurality of file segments, wherein the unencoded data file is a dataset for use with a blockchain process; mapping each of the plurality of file segments to a portion of a dictionary file to generate a plurality of mappings that each include a starting location and a length, thus generating a related encoded data file based, at least in part, upon the plurality of mappings; receiving a request to manipulate the unencoded data file from the blockchain process; and processing the related encoded data file based, at least in part, upon the plurality of mappings and the dictionary file to generate a modified encoded data file that represents the requested manipulations of the unencoded data file.
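
As a rough illustration of the mapping idea in this abstract, the sketch below encodes a byte string as a list of (starting location, length) pairs into a dictionary, then performs a manipulation directly on the encoded form. The fixed segment size, the function names `encode` and `decode`, and the use of a simple substring search are assumptions made for the example; the abstract does not say how segments are identified or matched.

```python
def encode(data, dictionary, segment_size=4):
    """Map each fixed-size segment of data to a (start, length) span in dictionary.

    Fixed-size segmentation is an assumption made for this sketch.
    """
    mappings = []
    for i in range(0, len(data), segment_size):
        segment = data[i:i + segment_size]
        start = dictionary.find(segment)
        if start < 0:
            raise ValueError(f"segment {segment!r} not present in dictionary")
        mappings.append((start, len(segment)))
    return mappings


def decode(mappings, dictionary):
    """Rebuild the unencoded bytes by reading each span from the dictionary."""
    return b"".join(dictionary[start:start + length] for start, length in mappings)


dictionary = b"the quick brown fox jumps over the lazy dog"
encoded = encode(b"lazy fox", dictionary)

# A requested manipulation (appending " dog") applied to the encoded form only:
modified = encoded + encode(b" dog", dictionary)
restored = decode(modified, dictionary)
```

Here the append never touches the unencoded bytes; only the list of mappings grows, and decoding the modified list yields the manipulated file.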

High performance space efficient distributed storage
11550755 · 2023-01-10

High performance space efficient distributed storage is disclosed. For example, a distributed storage volume (DSV) is deployed on a plurality of hosts, with a first host storing a local cache, and a storage controller executing on a processor of the first host receives a request to store a first file. The first file is stored to the local cache. The DSV is queried to determine whether a second file that is a copy of the first file is stored in the DSV. In response to determining that the DSV lacks the second file, the first file is transferred from the local cache to the DSV and then replicated to a second host of the plurality of hosts. In response to determining that the second file resides in the DSV, a reference to the second file is stored in the DSV and then replicated to the second host.
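
The store path described above can be sketched as follows. This is a minimal in-memory model, not the patented design: the `Host` and `StorageController` classes are hypothetical, and detecting a copy by comparing SHA-256 digests is an assumption, since the abstract does not say how the DSV is queried for a copy.

```python
import hashlib


class Host:
    """One host's share of the distributed storage volume (hypothetical model)."""

    def __init__(self):
        self.blobs = {}       # content digest -> file bytes
        self.references = {}  # file name -> content digest


class StorageController:
    def __init__(self, primary, secondary):
        self.local_cache = {}
        self.primary = primary
        self.secondary = secondary

    def store(self, name, data):
        # 1. The file always lands in the local cache first.
        self.local_cache[name] = data
        # 2. Query the DSV for an existing copy (assumption: compare digests).
        digest = hashlib.sha256(data).hexdigest()
        if digest in self.primary.blobs:
            outcome = "referenced"
        else:
            # 3a. No copy: transfer from the cache, then replicate.
            self.primary.blobs[digest] = self.local_cache[name]
            self.secondary.blobs[digest] = data
            outcome = "transferred"
        # 3b. Record the name -> content reference and replicate it.
        self.primary.references[name] = digest
        self.secondary.references[name] = digest
        return outcome


primary, secondary = Host(), Host()
controller = StorageController(primary, secondary)
first = controller.store("a.txt", b"same bytes")
second = controller.store("b.txt", b"same bytes")
```

In this model the second store only adds a reference, so the volume holds one copy of the content however many names point to it.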

System and method for performing an antivirus scan using file level deduplication

Aspects of the disclosure describe methods and systems for performing an antivirus scan using file level deduplication. In an exemplary aspect, prior to performing an antivirus scan on files stored on at least two storage devices, a deduplication module calculates a respective hash for each respective file stored on the storage devices. The deduplication module identifies a first file stored on the storage devices and determines whether at least one other copy of the first file exists on the storage devices. In response to determining that another copy exists, the deduplication module stores the first file in a shared database, replaces all copies of the first file on the storage devices with a link to the first file in the shared database, and performs the antivirus scan on (1) the first file in the shared database and (2) the files stored on the storage devices.
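
A small sketch of the deduplicate-then-scan flow follows. The in-memory `devices` dictionary, the `("link", digest)` tuple used to represent a link, and the choice of SHA-256 are all assumptions for illustration; the abstract only says that a hash is calculated per file.

```python
import hashlib
from collections import defaultdict


def deduplicate(devices):
    """devices maps device name -> {path: bytes}. Duplicated files are moved
    into a shared database and replaced in place by ("link", digest) tuples."""
    locations_by_hash = defaultdict(list)
    for device, files in devices.items():
        for path, content in files.items():
            digest = hashlib.sha256(content).hexdigest()
            locations_by_hash[digest].append((device, path))

    shared_db = {}
    for digest, locations in locations_by_hash.items():
        if len(locations) > 1:  # at least one other copy exists
            first_device, first_path = locations[0]
            shared_db[digest] = devices[first_device][first_path]
            for device, path in locations:
                devices[device][path] = ("link", digest)
    return shared_db


def scan_targets(devices, shared_db):
    """Each distinct content is scanned once: shared files plus unlinked files."""
    targets = list(shared_db.values())
    for files in devices.values():
        targets.extend(c for c in files.values() if isinstance(c, bytes))
    return targets


devices = {
    "disk1": {"/a/tool.exe": b"payload", "/a/notes.txt": b"hello"},
    "disk2": {"/b/tool.exe": b"payload"},
}
shared = deduplicate(devices)
targets = scan_targets(devices, shared)
```

With three files on two devices, two of them identical, only two contents remain to be scanned.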

METHOD AND SYSTEM FOR FACILITATING DISTRIBUTED ENTITY RESOLUTION

A method for providing data blocking to facilitate distributed entity resolution is disclosed. The method includes receiving data sets from a source, the data sets including records that correspond to an entity; grouping each of the records into a block based on a shared characteristic, the block including a blocking key; converting the block into a data file, the data file corresponding to a predetermined file format; partitioning the data file based on the corresponding blocking key; determining, via a worker node, a potential record pair by using the partitioned data file; and persisting the potential record pair.
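
The blocking step can be illustrated with a short sketch: records are grouped by a blocking key, and candidate pairs are generated only within each block. The record fields, the choice of postal code as the shared characteristic, and the function names are hypothetical; the file-format conversion, partitioning across worker nodes, and persistence steps from the abstract are left out.

```python
from collections import defaultdict
from itertools import combinations


def block_records(records, key_fn):
    """Group records into blocks keyed by a shared characteristic."""
    blocks = defaultdict(list)
    for record in records:
        blocks[key_fn(record)].append(record)
    return blocks


def candidate_pairs(blocks):
    """Compare records only within a block, never across blocks."""
    pairs = []
    for members in blocks.values():
        for left, right in combinations(members, 2):
            pairs.append((left["id"], right["id"]))
    return pairs


records = [
    {"id": 1, "name": "Acme Corp", "zip": "10001"},
    {"id": 2, "name": "ACME Corporation", "zip": "10001"},
    {"id": 3, "name": "Globex", "zip": "94105"},
]
blocks = block_records(records, key_fn=lambda r: r["zip"])
pairs = candidate_pairs(blocks)
```

Blocking keeps the number of comparisons well below the all-pairs count, because records in different blocks are never paired.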

FILE DE-DUPLICATION FOR A DISTRIBUTED DATABASE

A device is configured to identify a file in a network device, to generate a first set of block hash codes for data blocks for a first instance of the file, and to generate a second set of block hash codes for data blocks for a second instance of the file. The device is further configured to determine that the first set of block hash codes matches the second set of block hash codes and to generate an entry in a file list for the instances of the file. The device is further configured to count the number of entries that are associated with the file and to determine that the number of entries is greater than a redundancy threshold value. The device is further configured to delete one or more instances of the file in response to determining that the number of entries is greater than the redundancy threshold value.
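
The comparison and threshold check might look like the sketch below. The block size, the SHA-256 hash, the function names, and the policy of keeping exactly `redundancy_threshold` instances are assumptions; the abstract only says that one or more instances are deleted once the count exceeds the threshold.

```python
import hashlib


def block_hash_codes(data, block_size=4):
    """Hash each fixed-size data block of one file instance."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def redundant_instances(instances, redundancy_threshold):
    """instances: list of (location, content) pairs for one file.

    Returns the locations that may be deleted (assumed policy: keep the
    first `redundancy_threshold` matching instances)."""
    if not instances:
        return []
    reference = block_hash_codes(instances[0][1])
    # File list: one entry per instance whose block hash codes match.
    file_list = [location for location, content in instances
                 if block_hash_codes(content) == reference]
    if len(file_list) > redundancy_threshold:
        return file_list[redundancy_threshold:]
    return []


instances = [
    ("node1", b"abcdefgh"),
    ("node2", b"abcdefgh"),
    ("node3", b"abcdefgh"),
    ("node4", b"abcdXXXX"),  # differs in the second block, so not an instance
]
to_delete = redundant_instances(instances, redundancy_threshold=2)
```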

System and method for error-resilient data reduction

A system and method for error-resilient data reduction, utilizing a phase detector, a data requestor, a multi-phase trainer, a reconstruction engine, a deconstruction engine, and one or more reference codebooks. The multi-phase trainer may be used to train the reconstruction and deconstruction engines on various phase sourceblocks in order to recover quickly from corrupted data files that cause the sourceblocks to fall out of phase alignment. The phase detector may determine when the sourceblocks get out of phase and when they return to in-phase by checking whether a predetermined threshold probability of correct encoding is met. The data requestor may request retransmission of only the data that was received out of phase.

Unique ID generation for sensors

Systems, methods, and computer-readable media are provided for generating a unique ID for a sensor in a network. Once the sensor is installed on a component of the network, the sensor can send attributes of the sensor to a control server of the network. The attributes of the sensor can include at least one unique identifier of the sensor or the host component of the sensor. The control server can determine a hash value using a one-way hash function and a secret key, send the hash value to the sensor, and designate the hash value as a sensor ID of the sensor. In response to receiving the sensor ID, the sensor can incorporate the sensor ID in subsequent communication messages. Other components of the network can verify the validity of the sensor using a hash of the at least one unique identifier of the sensor and the secret key.
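
One way to realize a keyed one-way hash of this kind is an HMAC, as sketched below. HMAC-SHA256 is an assumption: the abstract specifies only "a one-way hash function and a secret key". The function names and the sample identifier are hypothetical.

```python
import hashlib
import hmac


def issue_sensor_id(unique_identifier, secret_key):
    """Control server side: derive the sensor ID from a sensor attribute."""
    return hmac.new(secret_key, unique_identifier.encode(), hashlib.sha256).hexdigest()


def verify_sensor(unique_identifier, claimed_sensor_id, secret_key):
    """Other components: recompute the hash and compare in constant time."""
    expected = issue_sensor_id(unique_identifier, secret_key)
    return hmac.compare_digest(expected, claimed_sensor_id)


secret_key = b"example-secret-key"           # hypothetical key held by the network
sensor_id = issue_sensor_id("00:1a:2b:3c:4d:5e", secret_key)

genuine = verify_sensor("00:1a:2b:3c:4d:5e", sensor_id, secret_key)
spoofed = verify_sensor("ff:ff:ff:ff:ff:ff", sensor_id, secret_key)
```

Because the ID depends on both the identifier and the key, a component that knows the key can check a sensor's ID without consulting the control server.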

Efficient mechanism to perform auto retention locking of files ingested via distributed segment processing in deduplication backup servers

A command requesting creation of a backup file and issued by a client-side deduplication library is received. Upon creating the file, a first flag is set on the file indicating that the file should be automatically retention locked after a cooling off period has elapsed. During the cooling off period, a command requesting that the file be opened for writes is received. The first flag is cleared to exclude the file from being automatically retention locked after the cooling off period has elapsed. A second flag is set on the file indicating that writes to the file are in progress. A command requesting that the file be closed, the writes to the backup file thereby being complete, is received. The second flag is cleared. The first flag is reset to allow the file to be automatically retention locked after the cooling off period has elapsed.
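
The flag handling reads naturally as a small state machine, sketched below. The class and function names are hypothetical, the cooling off period is reduced to a single `cooling_off_elapsed` call rather than a timer, and rejecting writes to an already locked file is an assumption not stated in the abstract.

```python
from dataclasses import dataclass


@dataclass
class BackupFile:
    auto_lock_pending: bool = False   # the "first flag"
    writes_in_progress: bool = False  # the "second flag"
    retention_locked: bool = False


def create_file():
    # On creation the file is marked for automatic retention locking.
    return BackupFile(auto_lock_pending=True)


def open_for_writes(f):
    if f.retention_locked:
        raise PermissionError("file is retention locked")  # assumed behavior
    f.auto_lock_pending = False   # exclude from auto locking while writing
    f.writes_in_progress = True


def close_file(f):
    f.writes_in_progress = False
    f.auto_lock_pending = True    # eligible for auto locking again


def cooling_off_elapsed(f):
    # Called when the cooling off period expires for this file.
    if f.auto_lock_pending and not f.writes_in_progress:
        f.retention_locked = True


f = create_file()
open_for_writes(f)
cooling_off_elapsed(f)            # no lock: writes are still in progress
locked_while_writing = f.retention_locked
close_file(f)
cooling_off_elapsed(f)            # now the file is locked
```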

Cooperative access method, system, and architecture of external storage

The present disclosure provides a cooperative access method, system, and architecture for external storage. The method includes: pre-storing image compression configuration information and image decompression configuration information corresponding to access addresses of read and write operations of an image processing device; compressing image data and storing the compressed data to an external storage based on an access address of a write operation of the image processing device and the image compression configuration information; and decompressing the compressed data and sending the decompressed data to the image processing device based on an access address of a read operation of the image processing device and the image decompression configuration information. Because image data is compressed before it is stored in the external storage and decompressed before it is returned to the image processing device, the space required in the external storage is reduced and overall system performance improves.
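
A toy model of address-driven compression is sketched below. It uses `zlib` as a stand-in for an image codec, keeps the configuration as a list of address ranges, and stores data in a dictionary; all of these, along with the `CooperativeStore` name, are assumptions for illustration and say nothing about the hardware architecture the disclosure describes.

```python
import zlib


class CooperativeStore:
    def __init__(self, config):
        # config: list of (start, end, level) address ranges whose data is
        # compressed; pre-stored before any read or write takes place.
        self.config = config
        self.external = {}  # address -> stored bytes (the external storage)

    def _level_for(self, address):
        for start, end, level in self.config:
            if start <= address < end:
                return level
        return None

    def write(self, address, data):
        level = self._level_for(address)
        self.external[address] = data if level is None else zlib.compress(data, level)

    def read(self, address):
        stored = self.external[address]
        return stored if self._level_for(address) is None else zlib.decompress(stored)


store = CooperativeStore(config=[(0x1000, 0x2000, 6)])
image = bytes(1024)                  # a highly compressible dummy "image"
store.write(0x1000, image)           # inside the configured range: compressed
store.write(0x3000, image)           # outside the range: stored as is
round_trip = store.read(0x1000)
```

The image processing device sees the same bytes on read that it wrote, while the external storage holds fewer bytes for addresses in the configured range.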