G06F11/2082

Common server SAN core solution to enable software-defined storage

In an aspect of the disclosure, a method, a computer-readable medium, and a computer system are provided. The computer system includes a baseboard management controller (BMC). The BMC receives a first message from a first remote device on a management network. The BMC determines whether the first message is directed to a storage service or fabric service running on a host of the BMC. The host is a storage device. When the first message is directed to the storage service or fabric service, the BMC extracts a service management command from the first message. The BMC sends, through a BMC communication channel to the host, a second message containing the service management command. The BMC communication channel is established for communicating baseboard management commands between the BMC and the host.
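The routing step described above can be illustrated with a minimal sketch. The message layout (the `target` and `command` keys), the service names in `SERVICE_TARGETS`, and the `send_to_host` callback are hypothetical stand-ins for illustration only; the abstract does not specify a wire format or a channel API.

```python
# Hypothetical names for the services the abstract mentions.
SERVICE_TARGETS = {"storage", "fabric"}


def handle_management_message(message, send_to_host):
    """Forward a service management command to the host if applicable.

    message: dict with assumed keys 'target' and 'command'.
    send_to_host: callable standing in for the BMC-to-host channel.
    Returns True if the message was forwarded, False otherwise.
    """
    if message.get("target") not in SERVICE_TARGETS:
        # Not directed to the storage or fabric service: leave it to the
        # BMC's ordinary baseboard management handling.
        return False
    # Extract the service management command from the first message.
    command = message["command"]
    # Wrap it in a second message and send it over the host channel.
    send_to_host({"type": "service_management",
                  "target": message["target"],
                  "command": command})
    return True
```

In this sketch the same channel object would also carry ordinary baseboard management commands, which is why the forwarded message is tagged with its own `type`.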

OPTIMIZED RECOVERY IN DATA REPLICATION ENVIRONMENTS

A method for optimizing recovery in a data replication environment is disclosed. In one embodiment, such a method includes directing I/O from a primary site to a secondary site in response to a failure at the primary site. After the primary site has recovered from the failure, the method initiates a recovery process wherein updated data elements at the secondary site are copied to the primary site. The method determines a recorded average I/O latency for a host system driving I/O to the secondary site, and calculates an expected average I/O latency for the host system driving I/O to the primary site. The method redirects I/O from the secondary site to the primary site when a difference between the expected average I/O latency and the recorded average I/O latency reaches a threshold value. A corresponding system and computer program product are also disclosed.
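The redirect decision can be sketched as a simple comparison. The abstract does not state the sign convention of the difference, so this sketch assumes that I/O is redirected once the expected primary-site latency is lower than the recorded secondary-site latency by at least the threshold. The function name and parameters are hypothetical.

```python
from statistics import mean


def should_redirect_to_primary(recorded_latencies_ms, expected_latency_ms,
                               threshold_ms):
    """Decide whether to redirect I/O from the secondary to the primary site.

    recorded_latencies_ms: latency samples observed while the host drives
        I/O to the secondary site.
    expected_latency_ms: calculated average latency if the host drove I/O
        to the primary site instead.
    threshold_ms: assumed minimum improvement that justifies redirecting.
    """
    recorded_avg = mean(recorded_latencies_ms)
    # Assumption: the "difference" is recorded minus expected, so a larger
    # value means a larger benefit from redirecting.
    return (recorded_avg - expected_latency_ms) >= threshold_ms
```

A caller would re-evaluate this periodically during the recovery process, since the expected latency at the primary site changes as more updated data elements are copied back.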

Systems and methods for adding active volumes to existing replication configurations
09836515 · 2017-12-05

A computer-implemented method for adding active volumes to existing replication configurations may include (1) identifying a new volume to be added to an existing replication configuration that replicates a plurality of volumes to a remote storage device, (2) using interchangeable bitmaps to perform an initial synchronization of the new volume with the remote storage device before replicating the new volume to the remote storage device as part of the existing replication configuration, (3) determining that a replication log associated with the replication configuration is capable of flagging future writes by the application to the new volume without overflowing, and, upon making that determination, (4) replicating the new volume to the remote storage device as part of the existing replication configuration. Various other methods, systems, and computer-readable media are also disclosed.

TECHNIQUE FOR EFFICIENT DATA FAILOVER IN A MULTI-SITE DATA REPLICATION ENVIRONMENT
20220374316 · 2022-11-24

A technique provides efficient data failover by creation and deployment of a protection policy that ensures maintenance of frequent common snapshots between sites of a multi-site data replication environment. A global constraint optimizer executes on a node of a cluster to create the protection policy for deployment among other nodes of clusters at the sites. Constraints such as protection rules (PRs) specifying, e.g., an amount of tolerable data loss are applied to a category of data designated for failover from a primary site over a network to a plurality of (secondary and tertiary) sites typically located at geographically separated distances. The optimizer processes the PRs to compute parameters such as frequency of snapshot generation and replication among the sites, as well as retention of the latest common snapshot maintained at each site to create a recovery point and configuration of the protection policy that reduces network traffic for efficient use of the network among the sites.

Switching between fault response models in a storage system

Switching between mediation models among a plurality of storage systems, where the switching between mediation models includes: determining, among one or more of the plurality of storage systems, a change in availability of a mediator service, wherein one or more of the plurality of storage systems are configured to request mediation from the mediator service in response to a fault; and communicating, among the plurality of storage systems and responsive to determining the change in availability of the mediator service, a fault response model to be used as an alternative to the mediator service among one or more of the plurality of storage systems.

METHODS FOR FLEXIBLE DATA-MIRRORING TO IMPROVE STORAGE PERFORMANCE DURING MOBILITY EVENTS AND DEVICES THEREOF

A method, device, and non-transitory computer readable medium for mirroring data, comprising selecting, based on a plurality of data attributes, a portion of local data in a local storage device for mirroring to a remote storage device and copying the selected portion of the local data to at least one cache memory of the remote storage device. Next, a determination of when a failover event has occurred in the local storage device is made, wherein the failover event comprises an event in which the local data in the local storage device is inaccessible to a client computing device when the client computing device attempts to access the local data from the local storage device. A copy of the local data from the cache memory in the remote storage device is retrieved when the failover event is determined to have occurred.

Data recovery in multi-leader distributed systems

Disclosed are a method and system for recovering a distributed system from a failure of a data storage unit. The distributed system includes a plurality of computer systems, each having a read-write computer and a data storage unit. Data is replicated from a particular data storage unit to other data storage units using a publish-subscribe model. A read-write computer receives the replicated data, processes the data for any conflicts, and stores it in the data storage unit. If a data storage unit fails, another data storage unit that has the latest data corresponding to the failed data storage unit is determined, and the latest data is replicated to the other data storage units. Accordingly, the distributed system continues to have the data of the failed data storage unit. The failed data storage unit may be reconstructed using data from one of the other data storage units in the distributed system.
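One way to picture the recovery step is to assume that every data storage unit tracks the highest sequence number it has applied from each originating unit. That bookkeeping, and the names below, are assumptions made for illustration; the abstract does not say how "latest data" is determined.

```python
def plan_recovery(failed_unit, applied_seq):
    """Pick the survivor with the latest data from the failed unit.

    applied_seq: {unit: {origin_unit: highest sequence number applied}}
        (assumed bookkeeping, not taken from the abstract).
    Returns (source_unit, lagging_units): the survivor holding the latest
    data that originated at failed_unit, and the survivors that still
    need that data replicated to them.
    """
    survivors = {unit: seqs.get(failed_unit, 0)
                 for unit, seqs in applied_seq.items()
                 if unit != failed_unit}
    if not survivors:
        raise ValueError("no surviving data storage unit")
    # The survivor that has applied the most updates from the failed unit.
    source = max(survivors, key=survivors.get)
    latest = survivors[source]
    lagging = sorted(u for u, seq in survivors.items() if seq < latest)
    return source, lagging
```

The returned source unit would then publish the missing updates to the lagging units, and could later serve as the basis for reconstructing the failed unit.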

Issuing operations directed to synchronously replicated data

Managing connectivity to synchronously replicated storage systems, including: identifying a plurality of storage systems across which a dataset is synchronously replicated; identifying a host that can issue I/O operations directed to the dataset; identifying a plurality of data communications paths between the host and the plurality of storage systems across which a dataset is synchronously replicated; identifying, from amongst the plurality of data communications paths between the host and the plurality of storage systems across which a dataset is synchronously replicated, one or more optimal paths; and issuing, to the host, an identification of the one or more optimal paths.
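The path-selection step can be sketched as follows. The abstract does not define what makes a path optimal, so this sketch assumes a single measured latency per path and treats the lowest-latency paths as optimal; the `Path` fields and the `tolerance_ms` parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    host_port: str        # assumed identifier of the host-side port
    storage_system: str   # storage system the path leads to
    latency_ms: float     # assumed optimality metric


def optimal_paths(paths, tolerance_ms=0.0):
    """Return the paths whose latency is within tolerance of the best one."""
    if not paths:
        return []
    best = min(p.latency_ms for p in paths)
    return [p for p in paths if p.latency_ms <= best + tolerance_ms]
```

The resulting list is what would be issued to the host as the identification of the optimal paths; a non-zero tolerance lets the host spread I/O over several near-equivalent paths.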

Label propagation in a distributed system

Data are maintained in a distributed computing system that describe a graph. The graph represents relationships among items. The graph has a plurality of vertices that represent the items and a plurality of edges connecting the plurality of vertices. At least one vertex of the plurality of vertices includes a set of label values indicating the at least one vertex's strength of association with a label from a set of labels. The set of labels describe possible characteristics of an item represented by the at least one vertex. At least one edge of the plurality of edges includes a set of label weights for influencing label values that traverse the at least one edge. A label propagation algorithm is executed for a plurality of the vertices in the graph in parallel for a series of synchronized iterations to propagate labels through the graph.
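A single-process sketch of synchronized label propagation is shown below. It assumes a simple additive update in which each edge scales the source vertex's label value by the edge's label weight; the abstract does not give the update rule, and a real deployment would partition the vertices across machines rather than loop over them in one process.

```python
def propagate_labels(label_values, edges, iterations):
    """Propagate label values along weighted edges in synchronized rounds.

    label_values: {vertex: {label: strength}}
    edges: iterable of (src, dst, {label: weight}) tuples
    Each round reads only the values from the previous round, which is
    what makes the iterations synchronized.
    """
    current = {v: dict(vals) for v, vals in label_values.items()}
    for _ in range(iterations):
        # Start the next round from a copy so updates made in this round
        # are not visible to other vertices until the round completes.
        nxt = {v: dict(vals) for v, vals in current.items()}
        for src, dst, weights in edges:
            for label, strength in current.get(src, {}).items():
                contribution = strength * weights.get(label, 0.0)
                target = nxt.setdefault(dst, {})
                target[label] = target.get(label, 0.0) + contribution
        current = nxt
    return current
```

With a chain a -> b -> c, a label on a reaches b after one iteration and c only after two, because c sees b's value from the previous round.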

Synchronous replication

One or more techniques and/or computing devices are provided for synchronous replication. For example, synchronous replication relationships are established between a first storage object (e.g., a file, a logical unit number (LUN), a consistency group, etc.), hosted by a first storage controller, and a plurality of replication storage objects hosted by other storage controllers. In this way, a write operation to the first storage object is implemented in parallel upon the first storage object and the replication storage objects in a synchronous manner, such as using a zero-copy operation to reduce overhead otherwise introduced by performing copy operations. Reconciliation is performed in response to a failure so that the first storage object and the replication storage objects comprise consistent data. Failed write operations and replication write operations are retried, while enforcing a single write semantic. Dependent write order consistency is enforced for dependent write operations, such as overlapping write operations.
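The parallel write with retry can be sketched with a thread pool. This is an illustration under stated assumptions, not the disclosed implementation: each target is modeled as a hypothetical callable `write(key, value)` that raises `OSError` on failure, and writes are assumed idempotent so that a retry does not apply the same write twice.

```python
from concurrent.futures import ThreadPoolExecutor


def _write_with_retry(write, key, value, retries):
    """Try one target up to retries + 1 times; return True on success."""
    for _ in range(retries + 1):
        try:
            write(key, value)
            return True
        except OSError:
            continue
    return False


def synchronous_write(targets, key, value, retries=2):
    """Apply a write to all targets in parallel.

    targets: non-empty list of callables write(key, value), one for the
        first storage object and one per replication storage object.
    Returns True only if every target accepted the write, which is when
    the write could be acknowledged to the client.
    """
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        results = list(pool.map(
            lambda write: _write_with_retry(write, key, value, retries),
            targets))
    return all(results)
```

If the function returns False, the objects may hold inconsistent data, which is the situation the reconciliation step in the abstract is meant to resolve.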