G06F11/2033

VIRTUALIZED FILE SERVER DISASTER RECOVERY

In one embodiment, a system for managing a virtualization environment includes a set of host machines, each of which includes a hypervisor, virtual machines, and a virtual machine controller, and a virtualized file server (VFS) backup system. The backup system is configured to identify backup data, wherein the backup data comprises data stored on the virtual disks and VFS configuration information and is identified in accordance with a backup policy; send the backup data to one or more remote sites for storage; and, in response to detection of changes in the backup data, send the changes to the remote sites in accordance with a replication policy. The backup data may be identified based on a protection domain associated with the backup policy. The data stored on the VFS may include one or more storage objects, such as shares, groups of shares, files, or directories.
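The policy-driven identify/send/replicate flow above can be sketched minimally in Python. All names here (`VFSBackupSystem`, `RemoteSite`, `replicate_changes`) and the dictionary-based data layout are illustrative assumptions, not the patent's implementation:

```python
class RemoteSite:
    """Toy stand-in for a remote storage site."""
    def __init__(self):
        self.data = {}

    def store(self, objects):
        self.data.update(objects)


class VFSBackupSystem:
    def __init__(self, protection_domain, remote_sites):
        # Storage objects (shares, files, directories) covered by the backup policy.
        self.protection_domain = set(protection_domain)
        self.remote_sites = remote_sites
        self.last_sent = {}  # object name -> last replicated version

    def identify_backup_data(self, storage_objects):
        # Select only objects belonging to the protection domain.
        return {name: data for name, data in storage_objects.items()
                if name in self.protection_domain}

    def full_backup(self, storage_objects):
        backup = self.identify_backup_data(storage_objects)
        for site in self.remote_sites:
            site.store(dict(backup))
        self.last_sent = dict(backup)
        return backup

    def replicate_changes(self, storage_objects):
        # Per the replication policy: send only objects changed since last send.
        current = self.identify_backup_data(storage_objects)
        delta = {k: v for k, v in current.items() if self.last_sent.get(k) != v}
        for site in self.remote_sites:
            site.store(dict(delta))
        self.last_sent.update(delta)
        return delta
```

A full backup seeds the remote sites; subsequent calls ship only the delta.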

System and method for handling multi-node failures in a disaster recovery cluster

A system and method for handling multi-node failures in a disaster recovery cluster is provided. In the event of an error condition, a switchover operation occurs from the failed nodes to one or more surviving nodes. Data stored in non-volatile random access memory is recovered by the surviving nodes to bring storage objects, e.g., disks, aggregates, and/or volumes, into a consistent state.
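The switchover step can be illustrated with a small Python sketch in which a surviving node replays the failed nodes' non-volatile log entries; the dictionary layout of the nodes is an assumption for illustration only:

```python
def switchover(surviving_node, failed_nodes):
    """Replay each failed node's non-volatile (NVRAM) log onto the surviving
    node so that storage objects (disks, aggregates, volumes) reach a
    consistent state. Node structure here is hypothetical."""
    for node in failed_nodes:
        for volume, value in node["nvlog"]:
            surviving_node["volumes"][volume] = value  # apply the logged update
        node["nvlog"].clear()  # updates are now durable on the survivor
    return surviving_node["volumes"]
```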

NETWORK TRANSCEIVER WITH AUTO-NEGOTIATION HANDLING
20170315887 · 2017-11-02

A re-timer with forward error correction handling is disclosed. An example first intermediate transceiver includes a first interface to communicatively couple the first intermediate transceiver with a first computing device; a second interface to communicatively couple the first intermediate transceiver to a second intermediate transceiver, which is configured to be communicatively coupled with a second computing device; and an auto-negotiation controller to: terminate a first auto-negotiation with the first computing device before the first auto-negotiation is completed, transmit, to the second intermediate transceiver, first capabilities of the first computing device determined during the first auto-negotiation, and perform a second auto-negotiation with the first computing device utilizing the first capabilities of the first computing device and second capabilities of the second computing device received from the second intermediate transceiver.
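The two-stage negotiation might be modeled as below. This is a toy sketch, assuming capabilities can be represented as sets intersected between the two attached devices; it is not the actual transceiver logic:

```python
class IntermediateTransceiver:
    """Toy model of an intermediate transceiver (re-timer) that proxies
    auto-negotiation between two end devices."""
    def __init__(self):
        self.peer = None
        self.local_caps = None   # capabilities learned from the attached device
        self.remote_caps = None  # capabilities relayed by the peer transceiver

    def connect(self, peer):
        self.peer, peer.peer = peer, self

    def first_autoneg(self, device_caps):
        # Terminate the first auto-negotiation early, but record and relay
        # the capabilities learned so far to the peer transceiver.
        self.local_caps = set(device_caps)
        self.peer.remote_caps = self.local_caps

    def second_autoneg(self):
        # Re-negotiate with the local device using both ends' capabilities.
        return self.local_caps & self.remote_caps
```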

Application transparent continuous availability using synchronous replication across data stores in a failover cluster

Disclosed herein is a system and method for automatically moving an application from one site to another site in the event of a disaster. Prior to coming back online, the application is configured with information that allows it to run on the new site without having to perform configuration actions after it has come online. This enables a seamless experience for the user of the application while also reducing the application's downtime.
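A minimal sketch of the pre-configuration idea, assuming a dictionary-based application record; the field names are invented for illustration:

```python
def fail_over(app, target_site, site_config):
    # Apply the target site's configuration *before* bringing the application
    # online, so no configuration actions (and extra downtime) are needed
    # after startup.
    app["site"] = target_site
    app.update(site_config)   # e.g. replicated data-store paths, endpoints
    app["online"] = True
    return app
```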

METHOD AND SYSTEM FOR RECONSTRUCTING A SLOT TABLE FOR NFS BASED DISTRIBUTED FILE SYSTEMS
20170310750 · 2017-10-26

A method and a system for reconstructing a slot table for Network File System (NFS) based distributed file systems are provided herein. The method includes: receiving a retried request from a client at a node of the distributed file system; in a case that the retried request is of a re-entrant idempotent type, processing the request again; in a case that the retried request is file-state related, checking whether an already opened file handle with exactly the same properties exists for the particular client and, if found, returning the file handle information to the client as if the file had just been opened by it; and in a case that the retried request is of a non-idempotent type, attempting to perform the operation again, wherein if the source file does not exist, checking for the existence of the expected outcome and replying with a success.
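The three retry cases might be expressed roughly as below. The request and handle structures are assumptions, and the non-idempotent branch uses a retried rename as the example of checking for the expected outcome:

```python
def handle_retried_request(req, open_handles, fs):
    """Sketch of the per-type retry handling; all field names are hypothetical."""
    if req["kind"] == "idempotent":
        return req["op"]()  # safe to simply re-execute
    if req["kind"] == "file_state":
        # Reply with an existing handle opened by this client with exactly
        # the same properties, as if the file had just been opened.
        for handle in open_handles:
            if handle["client"] == req["client"] and handle["props"] == req["props"]:
                return handle
        return None
    if req["kind"] == "non_idempotent":
        # e.g. a retried rename: if the source is gone but the expected
        # outcome (the target) exists, the first attempt already succeeded.
        if req["source"] not in fs and req["target"] in fs:
            return "success"
        return req["op"]()
```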

Self-healing virtualized file server

In one embodiment, a system for managing a virtualization environment comprises a plurality of host machines, one or more virtual disks comprising a plurality of storage devices, a virtualized file server (VFS) comprising a plurality of file server virtual machines (FSVMs), wherein each of the FSVMs runs on one of the host machines and conducts I/O transactions with the one or more virtual disks, and a virtualized file server self-healing system configured to identify one or more corrupt units of stored data at one or more levels of a storage hierarchy associated with the storage devices, wherein the levels comprise one or more of the file level, filesystem level, and storage level, and, when data corruption is detected, cause each FSVM on which at least a portion of a corrupt unit of stored data is located to recover that unit of stored data.
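A simplified sketch of the detect-then-recover loop, assuming corruption is detected by comparing stored checksums against recomputed ones; the `FSVM` class and the placement map are illustrative:

```python
class FSVM:
    """Toy file server virtual machine that records recovery requests."""
    def __init__(self, name):
        self.name = name
        self.recovered = []

    def recover(self, unit):
        # In a real system: restore the unit from a replica or snapshot at
        # the appropriate level (file, filesystem, or storage).
        self.recovered.append(unit)


def self_heal(placement, stored_checksums, actual_checksums):
    """Identify corrupt units of stored data (checksum mismatch) and have
    every FSVM holding a portion of each unit recover it."""
    corrupt = []
    for unit, fsvms in placement.items():
        if stored_checksums[unit] != actual_checksums[unit]:
            for fsvm in fsvms:
                fsvm.recover(unit)
            corrupt.append(unit)
    return corrupt
```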

System, and control method and program for input/output requests for storage systems

Virtual first logical volumes are provided to a host. A virtual second logical volume, correlated with any one of the first logical volumes, is created in a storage node in correlation with a storage control module disposed in the storage node, and the correspondence relationship between the first and second logical volumes is managed as mapping information. When an I/O request in which a first logical volume is designated as the I/O destination is given from the host, the storage node that is the assignment destination of the I/O request is specified on the basis of the mapping information. The I/O request is assigned to the storage control module of the node's own storage node in a case where the specified storage node is its own node, and is assigned to another storage node in a case where the specified storage node is another node.
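The routing decision reduces to a small lookup over the mapping information; the tuple layout of the mapping below is an assumption for illustration:

```python
def route_io(request, mapping, local_node):
    """Route an I/O request addressed to a first logical volume.
    `mapping` associates each first logical volume with
    (owning storage node, second logical volume)."""
    node, second_volume = mapping[request["volume"]]
    if node == local_node:
        return ("local", second_volume)   # handle in this node's control module
    return ("forward", node)              # hand off to the owning storage node
```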

Server system, computer system, method for managing server system, and computer-readable storage medium
09792189 · 2017-10-17

In a server system, a hardware configuration comparison is made with respect to each combination of a current server and a backup server, and, by referring to hardware configuration matching policy information, the presence or absence of hardware configuration concealment and the possibility of a take-over are determined with respect to each combination of the current server and the backup server. In addition, with respect to each combination of the current server and the backup server, a configuration matching rate indicating the ratio of hardware configuration matching is calculated. Based on information about the presence or absence of hardware configuration concealment, information about the possibility of a take-over, and information about the configuration matching rate with respect to each combination of the current server and the backup server, the backup server as a take-over destination of the current server is allocated.
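The allocation could be sketched as below. This simplifies the patent's matching-policy and concealment checks into a single threshold on the configuration matching rate; the data layout and the `min_rate` parameter are assumptions:

```python
def allocate_backup(current, candidates, min_rate=0.5):
    """Pick a take-over destination for the current server: compute each
    candidate's hardware-configuration matching rate and, among candidates
    for which a take-over is deemed possible (rate >= min_rate), choose
    the one with the highest rate."""
    best, best_rate = None, -1.0
    for cand in candidates:
        matches = sum(1 for key, val in current["hw"].items()
                      if cand["hw"].get(key) == val)
        rate = matches / len(current["hw"])
        if rate >= min_rate and rate > best_rate:
            best, best_rate = cand, rate
    return best, best_rate
```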

FAILOVER OF APPLICATION SERVICES
20170293540 · 2017-10-12

The disclosure is directed to a failover mechanism for failing over an application service, e.g., a messaging service, from servers in a first region to servers in a second region. Data is stored as shards in which each shard contains data associated with a subset of the users. Data access requests are served by a primary region of the shard. A global shard manager manages failing over the application service from a current primary region to a secondary region of the shard. A leader service in the application service replicates data associated with the application service from the primary to the secondary region, and ensures that the state of various other services of the application service in the secondary region is consistent. The leader service confirms that there is no replication lag between the primary and secondary regions and fails over the application service to the secondary region.
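The leader's catch-up-then-promote sequence can be sketched as follows, modeling each region's replicated state as an append-only log; the shard structure is an assumption:

```python
def fail_over_shard(shard):
    """Replicate any remaining log entries to the secondary region, confirm
    there is no replication lag, then promote the secondary to primary.
    The dict layout here is hypothetical."""
    primary = shard["regions"][shard["primary"]]
    secondary = shard["regions"][shard["secondary"]]
    # Leader service: copy over the entries the secondary is missing.
    secondary["log"].extend(primary["log"][len(secondary["log"]):])
    if len(secondary["log"]) != len(primary["log"]):
        raise RuntimeError("replication lag present; failover aborted")
    # Swap roles: the secondary region becomes the shard's primary.
    shard["primary"], shard["secondary"] = shard["secondary"], shard["primary"]
    return shard["primary"]
```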

Adaptive datacenter topology for distributed frameworks job control through network awareness

Systems, methods, and computer program products to perform an operation comprising: receiving a priority of a distributed computing job, an intermediate traffic type of the distributed computing job, and a set of candidate compute nodes available to process the distributed computing job, each candidate compute node being available to process at least one input split of the distributed computing job; and selecting a mapper node from the candidate compute nodes for one of the input splits, wherein the mapper node is selected based on the priority and the intermediate traffic type of the distributed computing job, and wherein the mapper node is further selected upon determining that it is not affected by an error and that its resource utilization score does not exceed a utilization threshold.
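The selection criteria might be combined as below. The scoring rule (weighting bandwidth more heavily for high-priority, shuffle-heavy jobs) and the node fields are assumptions, not the patent's actual formula:

```python
def select_mapper(candidates, priority, traffic_type, util_threshold=0.8):
    """Pick a mapper node: exclude nodes affected by an error or whose
    resource utilization exceeds the threshold, then score the rest by
    priority and intermediate-traffic type."""
    viable = [n for n in candidates
              if not n["error"] and n["utilization"] <= util_threshold]
    if not viable:
        return None
    # Hypothetical scoring: high-priority jobs with heavy intermediate
    # (shuffle) traffic favor well-connected, lightly loaded nodes.
    weight = 2.0 if (priority == "high" and traffic_type == "shuffle-heavy") else 1.0
    return max(viable, key=lambda n: weight * n["bandwidth"] - n["utilization"])
```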