G06F11/1466

Managing updates and copying data in a point-in-time copy relationship expressed as source logical addresses and target logical addresses

Provided are a computer program product, system, and method for managing updates and copying data in a point-in-time copy relationship expressed as source logical addresses and target logical addresses. A copy relationship indicates a source set comprising a subset of source logical addresses to copy to a target set comprising a subset of target logical addresses. An update is received to a source logical address that has not been copied. Determinations are made of the target logical address corresponding to the source logical address to be updated according to the copy relationship, a target group of target logical addresses in the target set that includes the determined target logical address, and the source logical addresses in the source set that correspond to the target logical addresses in the target group. The determined source logical addresses are copied to the target logical addresses in the determined target group.
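
The copy-before-update flow described above can be sketched roughly as follows. This is a minimal illustration, not the claimed implementation: the offset-based address mapping, the fixed `group_size`, and the names `CopyRelationship` and `write_source` are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class CopyRelationship:
    source_start: int   # first source logical address in the source set
    target_start: int   # first target logical address in the target set
    length: int         # number of addresses in each set
    group_size: int     # assumption: target addresses per target group
    copied: set = field(default_factory=set)  # source addresses already copied

    def target_for(self, src):
        # Assumption: source and target sets correspond by a fixed offset.
        return self.target_start + (src - self.source_start)

    def source_for(self, tgt):
        return self.source_start + (tgt - self.target_start)

    def target_group(self, tgt):
        # Target addresses in the group containing tgt, clipped to the target set.
        index = (tgt - self.target_start) // self.group_size
        first = self.target_start + index * self.group_size
        last = min(first + self.group_size, self.target_start + self.length)
        return range(first, last)


def write_source(rel, source, target, src_addr, value):
    """Apply an update to the source, first preserving the point-in-time data."""
    in_set = rel.source_start <= src_addr < rel.source_start + rel.length
    if in_set and src_addr not in rel.copied:
        # Copy every not-yet-copied source address that maps into the target group.
        for tgt in rel.target_group(rel.target_for(src_addr)):
            src = rel.source_for(tgt)
            if src not in rel.copied:
                target[tgt] = source[src]
                rel.copied.add(src)
    source[src_addr] = value


source = {addr: f"s{addr}" for addr in range(100, 108)}
target = {}
rel = CopyRelationship(source_start=100, target_start=500, length=8, group_size=4)
write_source(rel, source, target, 102, "new")
```

In this sketch, updating source address 102 first copies source addresses 100 through 103 to target addresses 500 through 503, so the target keeps the old value of 102 while the source receives the new one.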

Ephemeral remote data store for dual-queue systems

A computer-implemented method, system, and computer-readable media are disclosed herein. In embodiments, the computer-implemented method may entail receiving, by a data service, live data associated with an entity. The entity may be, for example, a customer of the data service. The method may then route the live data to a dual-queue system. The live data may then be loaded into a live data queue for processing of the live data. In addition, the live data may be stored as a persistent backup of the live data in a stale data queue. A remote data store may periodically establish a connection with the dual-queue system, after which, at least a portion of the stale data may be transmitted to the remote data store. Additional embodiments are described and/or claimed.
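
As a rough illustration of the dual-queue routing described above, the sketch below loads each record into both a live queue and a stale queue, and lets a remote store pull a portion of the stale data when it connects. The class and method names are hypothetical, the in-memory `deque` only stands in for a persistent backup, and removing records from the stale queue once they are transmitted is an assumption.

```python
from collections import deque


class DualQueue:
    def __init__(self):
        self.live = deque()    # records awaiting live processing
        self.stale = deque()   # stand-in for the persistent backup copy

    def route(self, record):
        # Each incoming record is loaded into both queues.
        self.live.append(record)
        self.stale.append(record)

    def process_next(self):
        # Take the next record for live processing, or None if the queue is empty.
        return self.live.popleft() if self.live else None

    def drain_stale(self, max_records):
        # Called when the remote data store connects: hand over up to
        # max_records of stale data (assumption: transmitted data is removed).
        batch = []
        while self.stale and len(batch) < max_records:
            batch.append(self.stale.popleft())
        return batch


queues = DualQueue()
for record in ("r1", "r2", "r3"):
    queues.route(record)
first = queues.process_next()
batch = queues.drain_stale(max_records=2)
```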

Dynamically meeting SLAs without provisioning static capacity
09838332 · 2017-12-05

In one example, a method for identifying and allocating resources in a computing system includes checking, while one or more backup processes are running, database connections in an auto scaling group to determine whether the number of database connections in use in connection with the backup processes has decreased since a prior check was performed. When the number of database connections in use has decreased, an identification is made as to which of a plurality of queues, each respectively associated with one of the backup processes, has the greatest need for additional database connections. Next, various metrics are evaluated and, based on the evaluation of the metrics, one or more available database connections are assigned to the queue with the greatest need for additional database connections.
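
The allocation step described above might look something like the sketch below. The abstract does not specify which metrics are evaluated, so the need score used here (pending work per connection already held) is purely an assumption, as are the function name, the dictionary layout, and the fixed `pool_size`.

```python
def assign_freed_connection(queues, in_use_now, in_use_before, pool_size):
    """Give one free connection to the neediest queue, if usage has dropped.

    queues: list of dicts with "name", "pending", and "connections" keys
    (a hypothetical layout). Returns the chosen queue name, or None.
    """
    if in_use_now >= in_use_before:
        return None  # usage has not decreased since the prior check
    if pool_size - in_use_now <= 0:
        return None  # nothing available to hand out

    def need(queue):
        # Assumed metric: pending items per connection the queue already has.
        return queue["pending"] / max(queue["connections"], 1)

    neediest = max(queues, key=need)
    neediest["connections"] += 1
    return neediest["name"]


queues = [
    {"name": "backup-a", "pending": 10, "connections": 5},
    {"name": "backup-b", "pending": 9, "connections": 1},
]
chosen = assign_freed_connection(queues, in_use_now=6, in_use_before=8, pool_size=10)
```

Here `backup-b` has the higher need score (9 pending items on 1 connection), so it receives the freed connection.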

METHODS AND SYSTEMS FOR A NON-DISRUPTIVE PLANNED FAILOVER FROM A PRIMARY COPY OF DATA AT A PRIMARY STORAGE SYSTEM TO A MIRROR COPY OF THE DATA AT A CROSS-SITE SECONDARY STORAGE SYSTEM WITHOUT USING AN EXTERNAL MEDIATOR
20220374321 · 2022-11-24

Systems and methods are described for a non-disruptive planned failover from a primary copy of data at a primary storage cluster to a mirror copy of the data at a cross-site secondary storage cluster without using an external mediator. According to an example, a planned failover feature of a multi-site distributed storage system provides an order of operations such that a primary copy of a first data center continues to serve I/O operations until a mirror copy of a second data center is ready. This planned failover feature improves functionality and efficiency of the distributed storage system by providing non-disruptiveness during planned failover without using an external mediator based on a primary storage cluster being selected as an authority to implement a state machine with a persistent configuration database to track a planned failover state for the planned failover.

METHOD AND APPARATUS FOR DATA CHECKPOINTING AND RESTORATION IN A STORAGE DEVICE
20170344430 · 2017-11-30

In one embodiment, an apparatus comprises a storage device to store a reference namespace comprising a plurality of logical blocks that correspond to physical blocks of a memory, to receive a request to create a first snapshot namespace based on the reference namespace, and to initialize a plurality of logical blocks of the first snapshot namespace to map to corresponding logical blocks of the reference namespace.
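
The snapshot initialization described above can be pictured with a small mapping sketch. The dictionary representation, the `("ref", lba)` and `("phys", block)` tags, and the function names are hypothetical; the abstract only states that the snapshot's logical blocks are initialized to map to the corresponding logical blocks of the reference namespace.

```python
# Reference namespace: logical block address -> physical block number.
reference = {0: 17, 1: 42, 2: 8}


def create_snapshot(reference_map):
    # Each snapshot logical block initially points at the same logical block
    # of the reference namespace, so no data is copied at creation time.
    return {lba: ("ref", lba) for lba in reference_map}


def resolve(snapshot_map, reference_map, lba):
    """Return the physical block that backs a snapshot logical block."""
    kind, value = snapshot_map[lba]
    if kind == "ref":
        return reference_map[value]  # follow the link into the reference namespace
    return value  # assumption: a block may later map directly to a physical block


snapshot = create_snapshot(reference)
physical = resolve(snapshot, reference, 1)
```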

AUTOMATIC AND CUSTOMISABLE CHECKPOINTING
20170344564 · 2017-11-30

A checkpointing mechanism by which in-memory data structures are copied from computation nodes (200) to staging nodes (700) by using RDMA, checkpoints are made and kept in memory in the staging node (700), and then asynchronously copied to non-volatile storage (150). In contrast to previous approaches, checkpoints remain in volatile memory (740) as part of the checkpointing mechanism. As a result, recovery from checkpoint is potentially faster, since the required checkpoint may be already in memory (740) in the staging node (700). An automatic and customisable mechanism is provided to control when the checkpointing process is triggered. As an alternative to copying an object through the network, the object in memory can be updated to a newer version of the object by applying the chain of changes made in the object in the corresponding computation node (200).

Preserving tolerance to storage device failures in a storage system

Ensuring resiliency to storage device failures in a storage system, including: determining a number of storage device failures within a particular write group that are to be tolerated by the storage system; for a plurality of datasets stored within the storage system, writing each dataset to at least a predetermined number of storage devices within the particular write group, wherein the predetermined number of storage devices is greater than the number of storage device failures within the particular write group that are to be tolerated by the storage system; and responsive to recovering from a system interruption: determining a number of readable storage devices that contain a copy of the dataset; and if the number of readable storage devices that contain a copy of the dataset is not greater than the number of failures that are to be tolerated, writing the dataset to one or more additional storage devices.
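
The post-interruption check described above can be sketched as below. The function name and data layout are hypothetical, adding a device to a set stands in for actually writing the dataset, and topping each dataset back up to the predetermined number of copies is an assumption (the abstract only says "one or more additional storage devices").

```python
def repair_after_interruption(placement, readable, tolerated, required):
    """Restore redundancy for datasets whose readable copies are too few.

    placement: dataset name -> set of devices that were written
    readable:  set of devices in the write group that can currently be read
    tolerated: number of device failures to be tolerated
    required:  predetermined number of copies; must exceed `tolerated`
    """
    if required <= tolerated:
        raise ValueError("required copies must exceed tolerated failures")
    repaired = {}
    for dataset, devices in placement.items():
        live = devices & readable
        if len(live) <= tolerated:
            # Not greater than the tolerated failures: write additional copies.
            candidates = sorted(readable - live)
            for device in candidates[: required - len(live)]:
                live = live | {device}  # stands in for writing the dataset
        repaired[dataset] = live
    return repaired


placement = {"a": {"d1", "d2", "d3"}, "b": {"d1", "d4", "d5"}}
readable = {"d1", "d4", "d5", "d6"}
result = repair_after_interruption(placement, readable, tolerated=2, required=3)
```

In this example dataset `a` has only one readable copy left, which is not greater than the two tolerated failures, so it is rewritten to two more readable devices; dataset `b` still has three readable copies and is left alone.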

Method, device and computer program product for job management

Embodiments of the present disclosure relate to a method, device, and computer program product for job management. The method comprises: obtaining an execution plan associated with a plurality of backup jobs including a target backup job, the execution plan at least indicating a size of backup data and start times of the plurality of backup jobs; determining, based on the execution plan, a first set of backup jobs to be executed in parallel at a start time of the target backup job; determining a predicted backup speed of executing the first set of backup jobs in parallel at the start time of the target backup job; and determining, at least based on the predicted backup speed and the size of the backup data of the target backup job, the time required for executing the target backup job. Accordingly, the time required for executing the backup jobs can be more accurately predicted.
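
The prediction steps above can be sketched as a short calculation. The speed model used here (a fixed total throughput shared equally among the jobs running in parallel) is an assumption made only for illustration, as are the function name and the tuple layout of the plan; the abstract does not say how the predicted speed is obtained.

```python
def predict_durations(plan, total_speed):
    """Predict how long each backup job takes.

    plan: list of (name, start_time_s, size_mb) tuples
    total_speed: assumed aggregate throughput in MB/s, shared equally
    Returns a dict mapping job name -> predicted duration in seconds.
    """
    end_times = {}
    durations = {}
    for name, start, size in sorted(plan, key=lambda job: job[1]):
        # Jobs running in parallel at this start time: those starting at the
        # same moment, plus earlier jobs predicted to still be running.
        parallel = sum(
            1
            for other, other_start, _ in plan
            if other_start == start
            or (other_start < start and end_times[other] > start)
        )
        speed = total_speed / parallel    # predicted speed for this job
        durations[name] = size / speed    # time = size / predicted speed
        end_times[name] = start + durations[name]
    return durations


plan = [("a", 0, 1000), ("b", 0, 500), ("c", 20, 300)]
durations = predict_durations(plan, total_speed=100)
# durations == {"a": 20.0, "b": 10.0, "c": 3.0}
```

Jobs `a` and `b` start together and each gets half the throughput, while `c` starts after both are predicted to have finished and therefore runs at full speed. Like the abstract, the sketch uses the speed predicted at the job's start time for the whole job.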

REAL TIME DATABASE BACKUP STATUS INDICATION AND RESTORE

A computer-implemented method at a data management system comprises: retrieving start and end times of a backup of a database; retrieving time stamps of log backups of the database; retrieving sequence numbers of the log backups; generating a graphical user interface illustrating a timeline of when database restoration is available and unavailable; making a second backup of the database; illustrating, on the graphical user interface during the making, pending availability of the second database backup; receiving a command to restore the database at an available time as illustrated by the graphical user interface; and restoring the database.

Predicting backup failures due to exceeding the backup window

Exemplary methods for predicting backup and restore failure include analyzing, at a management server, resource utilization statistics periodically collected during backup of data from a source storage system to a target storage system. In one embodiment, the methods include creating a predictive model based on the analysis of the collected resource utilization statistics. In one embodiment, the methods include predicting, using the predictive model, whether a backup time or a restore time of a future backup will exceed a backup time threshold or a restore time threshold, respectively.