Patent classifications
G06F11/1425
IMPLEMENTING AVAILABILITY DOMAIN AWARE REPLICATION POLICIES
Systems for distributed data storage. A method commences upon accessing a set of data items that describe computing nodes to be organized into a ring topology. The ring topology and distributed data storage policies are characterized by quantitative failure-resilient characteristics such as a replication factor. Various characteristics of the topology serve to bound two or more availability domains of the ring into which the computing nodes can be mapped. A set of quantitative values pertaining to respective quantitative failure-resilient characteristics are used for enumerating candidate ring topologies where the computing nodes are mapped into the availability domains. Using the quantitative failure-resilient characteristics, alternative candidate ring topologies are evaluated so as to determine a configuration score for candidate ring topologies. A candidate ring topology is configured based on a computed configuration score surpassing a threshold score. When a failure event is detected, the ring is reevaluated, remapped, and considered for reconfiguration.
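The enumerate-and-score step in this abstract can be illustrated with a minimal Python sketch. The abstract does not define the configuration score, so the metric below is an assumption: the fraction of nodes whose replicas (taken to be the next replication_factor - 1 ring successors) all fall in a different availability domain from the node itself. The names config_score and choose_ring are hypothetical, and brute-force enumeration is only practical for small rings.

```python
from itertools import permutations


def config_score(ring, domain_of, replication_factor):
    """Assumed metric: fraction of nodes whose replica holders (the next
    replication_factor - 1 successors on the ring) are all outside the
    node's own availability domain."""
    n = len(ring)
    domain_aware = 0
    for i, node in enumerate(ring):
        successors = [ring[(i + k) % n] for k in range(1, replication_factor)]
        if all(domain_of[s] != domain_of[node] for s in successors):
            domain_aware += 1
    return domain_aware / n


def choose_ring(nodes, domain_of, replication_factor, threshold):
    """Enumerate candidate rings (first node pinned, since rotations of a
    ring are equivalent) and return the first one whose score surpasses
    the threshold, or None if no candidate does."""
    first, rest = nodes[0], nodes[1:]
    for perm in permutations(rest):
        ring = [first, *perm]
        if config_score(ring, domain_of, replication_factor) > threshold:
            return ring
    return None


# Hypothetical four-node cluster split across two availability domains.
domains = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
ring = choose_ring(["a1", "a2", "b1", "b2"], domains, 2, 0.9)
```

On a detected failure, the same functions could be rerun over the surviving nodes to reevaluate the ring, which corresponds to the remapping step the abstract mentions.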
Resource Processing Method and Device for Multi-controller System
A resource processing method and device for a multi-controller system are provided. In the method, when a controller in the multi-controller system cannot sense the existence of a peer controller, the controller judges whether the peer controller still loads a first resource pool according to a first use tag stored in the first resource pool previously loaded by the peer controller.
Access point controller failover system
An access point IHS group controller failover system includes a first access point IHS group controller that controls a first access point IHS group that includes a plurality of access point IHSs. Following a failure of the first access point IHS group controller, the first access point IHS broadcasts a first access point IHS identifier to a first subset of the plurality of access point IHSs. The first access point IHS then registers the first subset of the plurality of access point IHSs as members of a second access point IHS group, and controls at least some functions of the second access point IHS group. When the first access point IHS detects activity from the first access point IHS group controller, it instructs the first subset of the plurality of access point IHSs in the second access point IHS group to reconnect to the first access point IHS group controller.
ARBITRATION PROCESSING METHOD AFTER CLUSTER BRAIN SPLIT, QUORUM STORAGE APPARATUS, AND SYSTEM
The present disclosure discloses an arbitration processing solution when brain split occurs in cluster. The solution includes: receiving, by a quorum storage apparatus within a first refresh packet detection period, first master quorum node preemption requests sent by at least two quorum nodes in the cluster; sending, by the quorum storage apparatus, a first master quorum node preemption success response message to the initial master quorum node indicating that the initial master quorum node succeeds in master quorum node preemption when the first master quorum node preemption requests received within the first refresh packet detection period comprise the master quorum node preemption request sent by the initial master quorum node.
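The arbitration rule in this abstract can be sketched as follows in Python. This is an illustration only: the function name arbitrate is hypothetical, requests are modeled as a set of node identifiers collected during one detection period, and the abstract does not say what happens when the initial master sends no request, so that case simply returns None here.

```python
def arbitrate(period_requests, initial_master):
    """Decide master preemption for one refresh packet detection period.

    period_requests: identifiers of the quorum nodes whose preemption
    requests arrived within the period.
    Returns the node that receives the success response, or None when
    the abstract gives no rule for the situation.
    """
    if initial_master in period_requests:
        # The initial master keeps its role whenever its own request
        # arrives in the period, regardless of competing requests.
        return initial_master
    return None  # outcome not specified by the abstract
```

Favoring the initial master avoids an unnecessary master switch when both sides of a split can still reach the quorum storage apparatus.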
Load balancing and fault tolerant service in a distributed data system
Techniques for load balancing and fault tolerant service are described. An apparatus may comprise a load balancing and fault tolerant component operative to execute a load balancing and fault tolerant service in a distributed data system. The load balancing and fault tolerant service distributes a load of a task to a first node in a cluster of nodes using a routing table. The load balancing and fault tolerant service stores information to indicate the first node from the cluster of nodes is assigned to perform the task. The load balancing and fault tolerant service detects a failure condition for the first node. The load balancing and fault tolerant service moves the task to a second node from the cluster of nodes to perform the task for the first node upon occurrence of the failure condition.
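The assign, record, detect, and move sequence described here can be sketched in Python. The class and method names are hypothetical, the routing table is modeled as a plain dictionary from task to node, and the least-loaded placement rule is an assumption, since the abstract does not say how the target node is chosen.

```python
class LoadBalancer:
    """Toy model of the service: tracks which node performs each task."""

    def __init__(self, nodes):
        self.nodes = list(nodes)   # healthy nodes in the cluster
        self.routing_table = {}    # task -> node assigned to perform it

    def _load(self, node):
        return sum(1 for n in self.routing_table.values() if n == node)

    def assign(self, task):
        # Assumed policy: pick the healthy node with the fewest tasks
        # (ties go to the node listed first).
        node = min(self.nodes, key=self._load)
        self.routing_table[task] = node  # record the assignment
        return node

    def on_failure(self, failed_node):
        """Move every task of the failed node to a surviving node."""
        self.nodes.remove(failed_node)
        moved = {}
        for task, node in list(self.routing_table.items()):
            if node == failed_node:
                moved[task] = self.assign(task)
        return moved


lb = LoadBalancer(["n1", "n2", "n3"])
first = lb.assign("t1")
second = lb.assign("t2")
moved = lb.on_failure("n1")
```

The sketch leaves out failure detection itself and does not handle the case where the last healthy node fails.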
Self healing cluster of a content management system
Systems and methods herein provide for a clustered content management system comprising at least two computing nodes. A first computing node comprises an instance of the content repository. The first computing node may perform content management operations on its instance of the content repository. Changes to the instance of the content repository of the first computing node are synchronized with the content repository by way of a second computing node. The second computing node is communicatively coupled to the first computing node through a network and is operable to synchronize the change with the content repository. The second computing node also determines that synchronization of the change is blocked due to an error. The second computing node identifies the error, determines that the error is correctable, and corrects the error to synchronize the change with the content repository.
HEARTBEAT-BASED DATA SYNCHRONIZATION APPARATUS AND METHOD, AND DISTRIBUTED STORAGE SYSTEM
A heartbeat-based data synchronization method is disclosed. The method is applied to a distributed storage system, and at least one data block group is stored in the distributed storage system. The distributed storage system includes multiple storage devices, one device in the multiple storage devices is a primary device for storing the data block group, and other devices are secondary devices for storing the data block group. The primary device performs the method. The primary device obtains access status information of the data block group, determines a heartbeat time of the data block group according to the access status information of the data block group, and sends a data synchronization instruction to the secondary device according to the heartbeat time of the data block group, where the data synchronization instruction is used to instruct the secondary device to synchronize data.
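As a rough Python illustration of deriving a heartbeat time from access status, the sketch below assumes that a more frequently accessed data block group should synchronize more often. The abstract does not state this mapping, so the formula, the function name heartbeat_interval, and the bounds min_s and max_s are all hypothetical.

```python
def heartbeat_interval(accesses_per_minute, min_s=1.0, max_s=60.0):
    """Return an assumed heartbeat time in seconds for one data block group.

    Busy groups get a short interval and idle groups fall back to max_s;
    the result is clamped to the range [min_s, max_s].
    """
    if accesses_per_minute <= 0:
        return max_s
    return max(min_s, min(max_s, 60.0 / accesses_per_minute))
```

In this model the primary device would send a synchronization instruction to the secondary devices each time the computed interval elapses for a group.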
Method, management node and processing node for continuous availability in cloud environment
Method, management node and processing node are disclosed for continuous availability in a cloud environment. According to an embodiment, the cloud environment comprises a plurality of layers and each layer includes at least two processing nodes. Each processing node in a layer can pull job(s) from the processing nodes in the upper layer if any and prepare job(s) for the processing nodes in the under layer if any. A method implemented at a management node comprises receiving measurement reports from the plurality of layers. The measurement report of each processing node comprises information about job(s) pulled from the upper layer if any and job(s) pulled by the under layer if any. The method further comprises determining information about failure in the cloud environment based on the measurement reports.
Dynamically erectable computer system
A fault-tolerant computer system architecture includes two types of operating domains: a conventional first domain (DID) that processes data and instructions, and a novel second domain (MM domain) which includes mentor processors for mentoring the DID according to “meta information” which includes but is not limited to data, algorithms and protective rule sets. The term “mentoring” (as defined herein below) refers to, among other things, applying and using meta information to enforce rule sets and/or dynamically erecting abstractions and virtualizations by which resources in the DID are shuffled around for, inter alia, efficiency and fault correction. Meta Mentor processors create systems and sub-systems by means of fault tolerant mentor switches that route signals to and from hardware and software entities. The systems and sub-systems created are distinct sub-architectures and unique configurations that may be operated separately or concurrently as defined by the executing processes.
Computer cluster with adaptive quorum rules
The fail-over computer cluster enables multiple computing devices to operate using adaptive quorum rules to dictate which nodes are in the fail-over cluster at any given time. The adaptive quorum rules provide requirements for communications between nodes and connections with voting file systems. The adaptive quorum rules include particular recovery rules for unplanned changes in node configuration, such as those due to a disruptive event. Such recovery quorum rules enable the fail-over cluster to continue operating with various changed configurations of its node members as a result of the disruptive event. In the changed configuration, access to voting file systems may not be required for a majority-group subset of nodes. If no majority-group subset remains, nodes may need direct or indirect access to voting file systems.
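The recovery rule in the last two sentences can be sketched as a small Python predicate. It assumes that "majority-group" means a strict majority of the membership before the disruptive event, and it collapses direct and indirect access to the voting file systems into a single flag. The function name and parameters are hypothetical.

```python
def may_stay_in_cluster(group_size, previous_cluster_size,
                        can_reach_voting_files):
    """Decide whether a surviving group of nodes may keep operating.

    group_size: number of nodes that can still communicate with each other.
    previous_cluster_size: cluster membership before the disruptive event.
    can_reach_voting_files: True if the group has direct or indirect
    access to the voting file systems.
    """
    if 2 * group_size > previous_cluster_size:
        # A majority group does not need the voting file systems.
        return True
    # Without a majority, voting file access decides membership.
    return can_reach_voting_files
```

For example, three surviving nodes of a five-node cluster continue without voting file access, while two survivors continue only if they can reach the voting file systems.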