Patent classifications
G06F16/1756
DATA BACKUP METHOD, DATA BACKUP DEVICE, AND COMPUTER PROGRAM PRODUCT
Embodiments of the present disclosure relate to a data backup method, a data backup device, and a computer program product. The method includes: determining delta data based on previous data and current data of a storage system; determining a delta data block subset in a delta data block set; sending delta index information and delta reference information associated with delta data blocks in the delta data block subset to a backup storage system; and sending, to the backup storage system, a further delta data block subset including delta data blocks in the delta data block set other than the delta data block subset. With the technical solution of the present disclosure, the amount of data transmission, the amount of computation, and the usage of a processing unit and a memory when backing up data can be reduced.
Dependency aware improvements to support parallel replay or parallel replication of operations which are directed to a common inode
Techniques are provided for dependency aware parallel splitting of operations. For example, a first operation and a second operation may be replicated in parallel from a first device to a second device if the operations only target a single common inode that is an access control list inode referenced by the operations. An operation that dereferences the access control list inode can be replicated in parallel with other operations if the operation does not have the potential to delete the access control list inode from the second device. In another example, operations may be replicated to the second device in parallel if the operations only affect a single common parent directory inode and where timestamps are only moved forward in time at the second device.
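The first rule in the abstract above can be sketched as a small predicate. This is only an illustration under stated assumptions: the `Op` record, its field names, and the treatment of operations with no common inode as parallel-safe are hypothetical and are not taken from the patent. The parent-directory and timestamp case is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical record of a file system operation awaiting replication."""
    name: str
    inodes: frozenset                    # inode numbers the operation touches
    acl_inodes: frozenset = frozenset()  # subset of `inodes` that are ACL inodes
    may_delete_acl: bool = False         # True if a dereference could remove the ACL inode


def can_replicate_in_parallel(a: Op, b: Op) -> bool:
    """Return True if two operations may be sent to the second device in parallel."""
    common = a.inodes & b.inodes
    if not common:
        # Assumption: operations with no shared inode have no dependency.
        return True
    if len(common) > 1:
        return False
    # Exactly one shared inode: allowed only if both operations reference it
    # as an ACL inode and neither could delete it at the destination.
    shared_is_acl = common <= a.acl_inodes and common <= b.acl_inodes
    return shared_is_acl and not (a.may_delete_acl or b.may_delete_acl)


chmod_f1 = Op("chmod f1", frozenset({10, 99}), frozenset({99}))
chmod_f2 = Op("chmod f2", frozenset({11, 99}), frozenset({99}))
unlink_f3 = Op("unlink f3", frozenset({12, 99}), frozenset({99}), may_delete_acl=True)
write_f1 = Op("write f1", frozenset({10}))
```

Here `chmod_f1` and `chmod_f2` share only ACL inode 99 and can proceed together, while `unlink_f3` is serialized because it might remove that inode.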
METHOD AND DEVICE FOR DATA SYNCHRONIZATION, STORAGE MEDIUM AND ELECTRONIC DEVICE
A method and device for data synchronization, a storage medium and an electronic device are provided. The method for data synchronization includes operations as follows. Synchronization configuration information for data to be synchronized is determined, and the synchronization configuration information at least includes a data identification of the data to be synchronized and a source data table identification of a source data table where the data to be synchronized is located. A source database is queried based on the source data table identification to obtain a target source data table where the data to be synchronized is located. A field identification of the data to be synchronized is determined from the target source data table based on the data identification. A target data table is constructed based on the field identification, and the data to be synchronized is synchronized into the target data table.
ANALYSIS OF STREAMING DATA USING DELTAS AND SNAPSHOTS
Implementations described herein relate to methods, systems, and computer-readable media to obtain snapshots used for analysis of streaming data. In some implementations, a computer-implemented method includes receiving initial data that includes a plurality of identifiers and corresponding timestamps; generating and storing a snapshot based on the initial data, wherein the snapshot includes the identifiers and a corresponding status; and receiving a data stream that includes a subset of the identifiers, activity information for each identifier in the subset, and corresponding timestamps. The method further includes periodically analyzing the data stream to obtain a delta that includes an updated status for each identifier in the subset, and storing the delta separately from the snapshot. The method further includes receiving a request for identifiers that are active in a particular time period and, based on the particular time period, retrieving active identifiers from the data stream, the delta, or the snapshot.
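A minimal sketch of the snapshot-plus-delta idea follows. The abstract does not say how a status is derived, so the rule used here (an identifier is "active" if its timestamp is at or after a cutoff) is an assumption, as are all function names. The stream-only retrieval path is omitted.

```python
def build_snapshot(initial, cutoff):
    """Build {identifier: status} from (identifier, timestamp) pairs.

    Assumption: 'active' means the timestamp is at or after `cutoff`.
    """
    return {ident: ("active" if ts >= cutoff else "inactive")
            for ident, ts in initial}


def build_delta(stream, cutoff):
    """Summarize stream events (identifier, activity, timestamp) into a delta.

    The delta holds updated statuses only for identifiers seen in the stream
    and is kept separate from the snapshot.
    """
    delta = {}
    for ident, _activity, ts in stream:
        if ts >= cutoff:
            delta[ident] = "active"
        else:
            delta.setdefault(ident, "inactive")
    return delta


def active_ids(snapshot, delta):
    """Answer a query by overlaying the delta on the snapshot without mutating either."""
    merged = {**snapshot, **delta}
    return {ident for ident, status in merged.items() if status == "active"}


snapshot = build_snapshot([("u1", 5), ("u2", 50)], cutoff=10)
delta = build_delta([("u1", "login", 60)], cutoff=10)
result = active_ids(snapshot, delta)
```

Keeping the delta separate means the stored snapshot is never rewritten; a query only pays for the overlay.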
DE-DUPLICATION OF DATA IN EXECUTABLE FILES IN A CONTAINER IMAGE
Methods, systems, and computer program products for de-duplicating data in executable files in a container image are disclosed. The method may include receiving a request to read a file in a first layer in a container image including a plurality of layers, wherein the file is a delta file which is from an updated executable file based on a base executable file, the base executable file is in a lower layer than the first layer in the container image, and the delta file includes block mappings between the updated executable file and the base executable file and different data between the two files, and blocks included in the two files are based on respective file structure. The method may also include restoring the updated executable file based on the delta file and the base executable file. The method may further include returning data in the updated executable file.
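The restore step described above can be illustrated as follows. This is a sketch, not the patented method: blocks are modeled as plain byte strings (the abstract derives them from file structure), and the `("base", index)` / `("data", bytes)` delta encoding is a hypothetical format chosen for clarity.

```python
def make_delta(base_blocks, updated_blocks):
    """Encode the updated file as mappings into the base file plus literal data."""
    first_index = {}
    for i, block in enumerate(base_blocks):
        first_index.setdefault(block, i)

    delta = []
    for block in updated_blocks:
        if block in first_index:
            delta.append(("base", first_index[block]))  # reuse a block from the lower layer
        else:
            delta.append(("data", block))               # store only the differing data
    return delta


def restore(base_blocks, delta):
    """Rebuild the updated file from the base file and the delta."""
    return [base_blocks[value] if kind == "base" else value
            for kind, value in delta]


base = [b".text-v1", b".data", b".rodata"]
updated = [b".text-v2", b".data", b".rodata", b".extra"]
delta = make_delta(base, updated)
restored = restore(base, delta)
```

Only the two blocks that differ from the base file are stored in the delta; the rest are references to the lower layer.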
Method for copying data, electronic device and computer program product
Techniques for replicating data involve: acquiring a first snapshot of a data block set, the first snapshot being a snapshot before a first subset of the data block set starts to be replicated; acquiring a second snapshot of the data block set, the second snapshot being a snapshot of the data block set when replication of the first subset is completed; and determining, based on a difference between the second snapshot and the first snapshot, a second subset of the data block set, the second subset being different from the first subset. Accordingly, such techniques can improve data protection efficiency in asynchronous replication.
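As a rough illustration of the snapshot comparison, the sketch below models each snapshot as a mapping from block ID to block content. That representation and the function name are assumptions made for the example.

```python
def changed_blocks(first_snapshot, second_snapshot):
    """Return IDs of blocks whose content differs between two snapshots.

    A block that exists in only one snapshot also counts as changed.
    """
    all_ids = first_snapshot.keys() | second_snapshot.keys()
    return {block_id for block_id in all_ids
            if first_snapshot.get(block_id) != second_snapshot.get(block_id)}


# Snapshot taken before the first subset starts replicating.
first = {0: b"aa", 1: b"bb", 2: b"cc"}
# Snapshot taken when replication of the first subset completes.
second = {0: b"aa", 1: b"bX", 2: b"cc", 3: b"dd"}

second_subset = changed_blocks(first, second)
```

Blocks 1 and 3 changed while the first subset was in flight, so they form the second subset to replicate.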
Data transfer using snapshot differencing from edge system to core system
A source system generates snapshots of collected data. The snapshots have respective associated time references. Responsive to a request from a target system for data collected over a time interval, the source system generates a subset of the data collected by determining a start snapshot and an end snapshot. The start snapshot and the end snapshot are determined as a pair of snapshots that have respective associated time references that are most closely spaced and are inclusive of the time interval. The source system determines a difference in the data included in the end snapshot and the start snapshot and provides the subset of the data as the difference in the data included in the end snapshot and the start snapshot.
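The snapshot selection can be sketched with a binary search over sorted time references. The data model here (a dict from time reference to a set of records) and the function names are assumptions for illustration only.

```python
from bisect import bisect_left, bisect_right


def pick_snapshots(times, start, end):
    """Pick the most closely spaced snapshot pair that covers [start, end].

    `times` must be sorted ascending. Returns (start_time, end_time).
    """
    i = bisect_right(times, start) - 1   # latest snapshot at or before `start`
    j = bisect_left(times, end)          # earliest snapshot at or after `end`
    if i < 0 or j >= len(times):
        raise ValueError("no snapshot pair covers the requested interval")
    return times[i], times[j]


def snapshot_diff(snapshots, start, end):
    """Return the records present in the end snapshot but not in the start snapshot."""
    t0, t1 = pick_snapshots(sorted(snapshots), start, end)
    return snapshots[t1] - snapshots[t0]


snapshots = {
    10: {"a"},
    20: {"a", "b"},
    30: {"a", "b", "c"},
    40: {"a", "b", "c", "d"},
}
subset = snapshot_diff(snapshots, 22, 28)
```

For the interval 22 to 28, the snapshots at 20 and 30 are the tightest covering pair, so only record `"c"` is sent to the target system.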
Maintaining high-availability of a file system instance in a cluster of computing nodes
A method for maintaining high-availability of file system instances is described. The method includes maintaining replica file system instances such as a first replica file system instance on a first computing node and a second replica file system instance on a second computing node. Further, a third computing node is instructed to create a sparse replica file system instance on the third computing node in response to detection of a failure condition associated with the second computing node. Moreover, a data update request is directed to the first replica file system instance and the sparse replica file system instance.
Backup objects for fully provisioned volumes with thin lists of chunk signatures
Examples may include backup objects for fully provisioned volumes with thin lists of chunk signatures. Examples may generate one or more full lists of chunk signatures for the address space of a fully provisioned volume, compare each chunk signature of the full list to an unused region chunk signature representing a chunk of an unused region of the fully provisioned volume, generate metadata to indicate used regions of the fully provisioned volume, based on the comparisons, and generate from the one or more full lists, one or more thin lists omitting all chunk signatures determined to match the unused region chunk signature.
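The full-list and thin-list relationship might look like the sketch below. The chunk size, the use of SHA-256 as the chunk signature, and the assumption that an unused region is all zero bytes are illustrative choices, not details from the abstract. The volume length is assumed to be a multiple of the chunk size.

```python
import hashlib

CHUNK_SIZE = 4  # deliberately tiny so the example is easy to follow


def signature(chunk: bytes) -> str:
    """Chunk signature; SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(chunk).hexdigest()


def thin_list(volume: bytes, chunk_size: int = CHUNK_SIZE):
    """Return (full list, used-region flags, thin list) for a volume."""
    # Assumption: an unused region reads back as zero bytes.
    unused_sig = signature(bytes(chunk_size))
    full = [signature(volume[i:i + chunk_size])
            for i in range(0, len(volume), chunk_size)]
    used = [sig != unused_sig for sig in full]           # metadata marking used regions
    thin = [sig for sig, is_used in zip(full, used) if is_used]
    return full, used, thin


volume = b"abcd" + bytes(4) + b"efgh" + bytes(4)
full, used, thin = thin_list(volume)
```

The `used` flags record where each retained signature belongs in the address space, which is what allows the thin list to drop the unused-region entries.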
Synchronized data deduplication
A system and method for data deduplication are presented. Data received from one or more computing systems is deduplicated, and the results of the deduplication process are stored in a reference table. A representative subset of the reference table is shared among a plurality of systems that utilize the data deduplication repository. This representative subset of the reference table can be used by the computing systems to deduplicate data locally before it is sent to the repository for storage. Likewise, it can be used to allow deduplicated data to be returned from the repository to the computing systems. In some cases, the representative subset can be a proper subset, wherein a portion of the reference table is identified and shared among the computing systems to reduce bandwidth requirements for reference-table synchronization.
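A minimal sketch of client-side deduplication against a shared subset of the reference table is shown below. The `Repository` class, the SHA-256 signatures, and the policy of sharing the first `limit` table entries are all assumptions made for the example; the abstract does not say how the representative subset is chosen.

```python
import hashlib


def digest(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


class Repository:
    """Hypothetical deduplication repository with a reference table."""

    def __init__(self):
        self.table = {}  # signature -> stored block

    def store(self, sig, block=None):
        if block is not None:
            self.table[sig] = block
        elif sig not in self.table:
            raise KeyError(f"unknown signature {sig}")

    def shared_subset(self, limit):
        # Assumption: the subset is simply the first `limit` signatures.
        return set(list(self.table)[:limit])


def client_send(blocks, known_signatures, repo):
    """Send blocks to the repository; return the number of payload bytes sent."""
    sent = 0
    for block in blocks:
        sig = digest(block)
        if sig in known_signatures:
            repo.store(sig)          # reference only, no payload transferred
        else:
            repo.store(sig, block)   # the repository may still deduplicate on its side
            sent += len(block)
    return sent


repo = Repository()
first_sent = client_send([b"alpha", b"beta"], set(), repo)
known = repo.shared_subset(1)
second_sent = client_send([b"alpha", b"beta", b"gamma"], known, repo)
```

In the second call the client skips the payload for `b"alpha"` because its signature is in the shared subset, while `b"beta"` is still transmitted since only a proper subset of the table was shared.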