Patent classifications
G06F11/1453
Data masking in a microservice architecture
A method includes retrieving an object from storage and copying the object; generating a list that identifies one or more byte ranges, of the copy of the object, to be masked; providing the list to a masker controller microservice that examines a recipe corresponding to the copy of the object, where the recipe references a slice of the copy of the object, and the slice includes one or more data segments; masking, by the masker controller microservice, a segment of the slice that is in one of the byte ranges, to create a masked segment; replacing, in the slice, the segment with the masked segment, to create a masked slice; creating a masked object recipe that contains a reference to the masked slice; creating a masked object that includes the masked slice, and that references any unmasked segments of the slice; and deduplicating the masked object.
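The masking step described above can be illustrated with a minimal sketch. It assumes a slice is modeled as an ordered list of byte-string segments and that each byte range is a half-open `(start, end)` pair measured from the start of the object copy. The function name `mask_slice` and the fill byte are hypothetical and do not come from the abstract.

```python
from typing import List, Tuple


def mask_slice(segments: List[bytes],
               ranges: List[Tuple[int, int]],
               fill: int = 0x2A) -> List[bytes]:
    """Return a new segment list with bytes inside any [start, end) range overwritten.

    Segments that no range touches come back byte-identical, so a caller
    could keep referencing the original unmasked segments.
    """
    masked = []
    offset = 0  # byte offset of the current segment within the object copy
    for seg in segments:
        buf = bytearray(seg)
        seg_end = offset + len(seg)
        for start, end in ranges:
            # Clip the range to the part that overlaps this segment.
            lo = max(start, offset)
            hi = min(end, seg_end)
            if lo < hi:
                buf[lo - offset:hi - offset] = bytes([fill]) * (hi - lo)
        masked.append(bytes(buf))
        offset = seg_end
    return masked


if __name__ == "__main__":
    # A range that straddles the first two segments; the third is untouched.
    print(mask_slice([b"abcd", b"efgh", b"ijkl"], [(2, 6)]))
```

In this sketch a byte range may straddle a segment boundary, which is why each range is clipped per segment rather than applied to the object as a whole.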
Garbage collection for a deduplicated cloud tier using functions
Systems and methods for performing data protection operations including garbage collection operations and copy forward operations. For deduplicated data stored in a cloud-based storage or in a cloud tier that stores containers containing dead and live segments or dead and live regions such as compression regions, the dead compression regions are deleted by copying the live compression regions into new containers and then deleting the old containers. The copy forward is based on a recipe from a data protection system and is performed using a serverless approach.
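As a rough illustration of the copy forward operation described above, the sketch below models containers as an in-memory mapping from container name to a list of compression-region identifiers. The names `copy_forward`, `live`, and `new_name` are assumptions for illustration only. The serverless execution and the recipe format are not modeled.

```python
from typing import Dict, List, Set


def copy_forward(containers: Dict[str, List[str]],
                 live: Set[str],
                 new_name: str) -> Dict[str, List[str]]:
    """Copy live regions out of containers that hold dead regions, then drop those containers.

    Containers whose regions are all live are left in place.
    """
    result: Dict[str, List[str]] = {}
    carried: List[str] = []
    for name, regions in containers.items():
        if all(r in live for r in regions):
            result[name] = list(regions)  # fully live: nothing to reclaim
        else:
            # Keep only the live regions; the old container is not carried over.
            carried.extend(r for r in regions if r in live)
    if carried:
        result[new_name] = carried
    return result


if __name__ == "__main__":
    before = {"c1": ["r1", "r2"], "c2": ["r3", "r4"], "c3": ["r5"]}
    print(copy_forward(before, live={"r1", "r2", "r3"}, new_name="c_new"))
```

Deleting a region in place is avoided here: live regions are rewritten into a new container and the old container is dropped whole, which matches the delete-by-copy approach the abstract describes.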
Automatic storage target recommendations using a storage classifier
Embodiments for a storage classifier that provides recommendations to a backup server for storage targets among a plurality of disparate target storage types. The storage classifier receives metadata (name, type, size) and the Service Level Agreement (with information such as retention time, Recovery Point Objective, and Recovery Time Objective) from the backup software. The backup software itself receives policy recommendations from a data label rules engine based on certain file attributes. The storage classifier receives an initial recommendation for the storage type and location (e.g., on-premises deduplication storage or public-cloud object storage) from a data classifier. Based on these inputs, the storage classifier provides recommended specific storage targets to the backup software on a file-by-file basis for data stored in a backup operation.
Technologies for providing shared memory for accelerator sleds
Technologies for providing shared memory for accelerator sleds include an accelerator sled to receive, with a memory controller, a memory access request from an accelerator device to access a region of memory. The request is to identify the region of memory with a logical address. Additionally, the accelerator sled is to determine, from a map of logical addresses and associated physical addresses, the physical address associated with the region of memory. In addition, the accelerator sled is to route the memory access request to a memory device associated with the determined physical address.
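The address lookup and routing described above can be sketched as follows. This is a simplified model that assumes the map is a list of contiguous regions sorted by logical base address. The tuple layout, device names, and the function `route` are hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple

# Each entry: (logical_base, length, device_id, physical_base), sorted by logical_base.
Mapping = Tuple[int, int, str, int]


def route(address_map: List[Mapping], logical_addr: int) -> Tuple[str, int]:
    """Translate a logical address to (device_id, physical_address)."""
    bases = [m[0] for m in address_map]
    # Find the last region whose base is <= the requested address.
    i = bisect_right(bases, logical_addr) - 1
    if i < 0:
        raise KeyError(f"unmapped logical address {logical_addr:#x}")
    base, length, device, phys = address_map[i]
    if logical_addr >= base + length:
        raise KeyError(f"unmapped logical address {logical_addr:#x}")
    return device, phys + (logical_addr - base)


if __name__ == "__main__":
    amap = [(0x0000, 0x1000, "mem0", 0x8000),
            (0x1000, 0x1000, "mem1", 0x0000)]
    print(route(amap, 0x1010))
```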
Source file copying and error handling
Object service receives request to copy file to destination and identifies group identifier for fingerprints group corresponding to sequential segments in file. Object service communicates request for fingerprints group to deduplication service associated with group identifier range including group identifier. Deduplication service communicates fingerprints group, retrieved from fingerprint storage, to object service, which communicates fingerprints group and group identifier to destination. Object service communicates request for file segments, corresponding to fingerprints missing in destination, communicated from destination, to deduplication service, which communicates requested segments, retrieved from source storage, to object service, which communicates requested segments to destination. System identifies generation identifier associated with time of communicating by object service or deduplication service, and generation identifier associated with another time of communicating by object service or deduplication service. If generation identifier associated with time differs from generation identifier associated with other time, object service or deduplication service restarts communication.
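The generation identifier check at the end of the abstract above can be illustrated with a small sketch. It assumes the communication is a list of callables and that a `generation` callable reports the current generation identifier. Both names, and the `max_restarts` limit, are assumptions rather than details from the abstract.

```python
from typing import Callable, List


def run_with_generation_check(steps: List[Callable[[], None]],
                              generation: Callable[[], int],
                              max_restarts: int = 3) -> int:
    """Run every step; restart from the first step if the generation identifier changes.

    Returns the number of restarts that were needed.
    """
    restarts = 0
    while True:
        started = generation()  # generation identifier at the start of the attempt
        for step in steps:
            step()
            if generation() != started:
                break  # identifier changed mid-communication: abandon this attempt
        else:
            return restarts  # every step ran under one generation identifier
        restarts += 1
        if restarts > max_restarts:
            raise RuntimeError("generation identifier kept changing")


if __name__ == "__main__":
    gens = iter([1, 1, 2, 2, 2, 2])  # the identifier changes once, mid-attempt
    log: List[str] = []
    n = run_with_generation_check(
        [lambda: log.append("fingerprints"), lambda: log.append("segments")],
        lambda: next(gens),
    )
    print(n, log)
```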
Destination file copying and error handling
Object service receives communication of fingerprints stream, corresponding to file segments, from file source, and identifies sequential fingerprints in fingerprints stream as fingerprints group. Object service identifies group identifier for fingerprints group, and communicates fingerprints group to deduplication service associated with group identifier range including group identifier. Deduplication service identifies fingerprints in fingerprints group which are missing from fingerprint storage, and communicates identified fingerprints to object service, which communicates request for file segments, corresponding to identified fingerprints, to file source. Deduplication service receives communication of requested segments from file source, and stores requested segments. System identifies generation identifier associated with time of communicating by object service or deduplication service and identifies generation identifier associated with another time of communicating by object service or deduplication service. If generation identifier associated with time differs from generation identifier associated with other time, object service or deduplication service restarts communication.
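The destination-side flow above (group sequential fingerprints, then report the ones missing from fingerprint storage) can be sketched as below. SHA-256 as the fingerprint function, the fixed group size, and all function names are assumptions for illustration. Routing a group to a deduplication service by group identifier range is not modeled.

```python
import hashlib
from typing import List, Set


def fingerprint(segment: bytes) -> str:
    """Fingerprint a segment (SHA-256 is an assumption; the abstract names no hash)."""
    return hashlib.sha256(segment).hexdigest()


def group_fingerprints(stream: List[str], group_size: int) -> List[List[str]]:
    """Split a fingerprint stream into groups of sequential fingerprints."""
    return [stream[i:i + group_size] for i in range(0, len(stream), group_size)]


def missing_fingerprints(group: List[str], stored: Set[str]) -> List[str]:
    """Return, in order, the fingerprints in the group that are not yet stored."""
    return [fp for fp in group if fp not in stored]


if __name__ == "__main__":
    segments = [b"alpha", b"beta", b"gamma"]
    fps = [fingerprint(s) for s in segments]
    stored = {fps[0]}  # the destination already holds the first segment
    for group in group_fingerprints(fps, group_size=2):
        # Only these segments would need to be requested from the file source.
        print(len(missing_fingerprints(group, stored)))
```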
Update of deduplication fingerprint index in a cache memory
In some examples, a system performs data deduplication using a deduplication fingerprint index in a hash data structure comprising a plurality of blocks, wherein a block of the plurality of blocks comprises fingerprints computed based on content of respective data values. The system merges, in a merge operation, updates for the deduplication fingerprint index to the hash data structure stored in a persistent storage. As part of the merge operation, the system mirrors the updates to a cached copy of the hash data structure in a cache memory, and updates, in an indirect block, information regarding locations of blocks in the cached copy of the hash data structure.
Systems and methods for managing single instancing data
Described in detail herein are systems and methods for managing single instancing data. Using a single instance database and other constructs (e.g. sparse files), data density on archival media (e.g. magnetic tape) is improved, and the number of files per storage operation is reduced. According to one aspect of a method for managing single instancing data, for each storage operation, a chunk folder is created on a storage device that stores single instancing data. The chunk folder contains three files: 1) a file that contains data objects that have been single instanced; 2) a file that contains data objects that have not been eligible for single instancing; and 3) a metadata file used to track the location of data objects within the other files. A second storage operation subsequent to a first storage operation contains references to data objects in the chunk folder created by the first storage operation instead of the data objects themselves.
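The chunk folder layout described above can be modeled in memory as a rough sketch. The three "files" are represented as two byte buffers and a metadata list. The class `ChunkFolder`, the `seen` index keyed by SHA-256 digest, and the metadata tuple layout are all hypothetical.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ChunkFolder:
    folder_id: int
    si_data: bytearray = field(default_factory=bytearray)     # file 1: single-instanced objects
    other_data: bytearray = field(default_factory=bytearray)  # file 2: objects not eligible
    # file 3: (name, kind, folder_id, offset, length) for every object in this operation
    metadata: List[Tuple[str, str, int, int, int]] = field(default_factory=list)


def store(folder: ChunkFolder, seen: Dict[str, Tuple[int, int]],
          name: str, data: bytes, eligible: bool = True) -> None:
    """Store one object, or record a reference if an identical object was stored earlier."""
    if not eligible:
        offset = len(folder.other_data)
        folder.other_data += data
        folder.metadata.append((name, "other", folder.folder_id, offset, len(data)))
        return
    digest = hashlib.sha256(data).hexdigest()
    if digest in seen:
        # Reference the copy in the earlier chunk folder instead of storing the data again.
        fid, offset = seen[digest]
        folder.metadata.append((name, "ref", fid, offset, len(data)))
        return
    offset = len(folder.si_data)
    folder.si_data += data
    seen[digest] = (folder.folder_id, offset)
    folder.metadata.append((name, "si", folder.folder_id, offset, len(data)))


if __name__ == "__main__":
    seen: Dict[str, Tuple[int, int]] = {}
    first, second = ChunkFolder(1), ChunkFolder(2)
    store(first, seen, "a.txt", b"hello")
    store(second, seen, "b.txt", b"hello")  # duplicate of an object in folder 1
    print(second.metadata)
```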
Load balancing across multiple data paths
Multiple data paths may be available to a data management system for transferring data between a primary storage device and a secondary storage device. The data management system may be able to gain operational advantages by performing load balancing across the multiple data paths. The system may use application-layer characteristics of the data when transferring it from a primary storage to a backup storage during a data backup operation, and correspondingly from a secondary or backup storage system to a primary storage system during restoration.
Systems and methods for management of virtualization data
Described in detail herein is a method of copying data of one or more virtual machines being hosted by one or more non-virtual machines. The method includes receiving an indication that specifies how to perform a copy of data of one or more virtual machines hosted by one or more virtual machine hosts. The method may include determining whether the one or more virtual machines are managed by a virtual machine manager that manages or facilitates management of the virtual machines. If so, the virtual machine manager is dynamically queried to automatically determine the virtual machines that it manages or that it facilitates management of. If not, a virtual machine host is dynamically queried to automatically determine the virtual machines that it hosts. The data of each virtual machine is then copied according to the specifications of the received indication.
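The discovery branch described above (query the virtual machine manager if one exists, otherwise query the host) can be sketched as follows. The `Inventory` protocol, the `StaticInventory` stand-in, and the function names are hypothetical. A real deployment would query a hypervisor or management API, which is not shown here.

```python
from typing import List, Optional, Protocol, Tuple


class Inventory(Protocol):
    """Anything that can report the virtual machines it knows about."""
    def list_vms(self) -> List[str]: ...


class StaticInventory:
    """Stand-in for a VM manager or VM host; returns a fixed list."""
    def __init__(self, vms: List[str]) -> None:
        self._vms = list(vms)

    def list_vms(self) -> List[str]:
        return list(self._vms)


def discover_vms(manager: Optional[Inventory], host: Inventory) -> List[str]:
    """Query the manager when the VMs are managed; otherwise query the host directly."""
    if manager is not None:
        return manager.list_vms()
    return host.list_vms()


def plan_copy(manager: Optional[Inventory], host: Inventory,
              spec: str) -> List[Tuple[str, str]]:
    """Pair every discovered VM with the copy specification from the received indication."""
    return [(vm, spec) for vm in discover_vms(manager, host)]


if __name__ == "__main__":
    host = StaticInventory(["vm3"])
    print(plan_copy(None, host, spec="full"))
```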