G06F11/2046

Disaster recovery systems and methods with low recovery point objectives
11579987 · 2023-02-14

Data recovery systems and methods utilize object-based storage for providing a data protection and recovery methodology with low recovery point objectives, and for enabling both full recovery and point-in-time based recovery. Data generated at a protected site (e.g., via one or more virtual machines) is intercepted during write procedures to primary storage. The intercepted data is replicated via a replication log, provided as data objects, and transmitted to an object-based storage system. During recovery, data objects may be retrieved through point-in-time based recovery directly by the systems of the protected site, and/or data objects may be provided via full recovery, for example, within a runtime environment of a recovery site, with minimal data loss and operation interruption by rehydrating data objects within the runtime environment via low-latency data transfer and rehydration systems.
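The write interception and replay described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: `ObjectStore`, `ReplicatingDisk`, `recover`, and the `log/<sequence>` key layout are hypothetical names chosen for illustration, and an in-memory dictionary stands in for the object-based storage system.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectStore:
    """Hypothetical in-memory stand-in for an object-based storage system."""
    objects: dict = field(default_factory=dict)

    def put(self, key, value):
        self.objects[key] = value


@dataclass
class ReplicatingDisk:
    """Primary storage whose writes are intercepted and logged as data objects."""
    store: ObjectStore
    blocks: dict = field(default_factory=dict)
    seq: int = 0

    def write(self, block, data):
        self.blocks[block] = data  # normal write to primary storage
        self.seq += 1
        # Intercepted copy: one data object per logged write, keyed by sequence.
        self.store.put(f"log/{self.seq:08d}", (block, data))
        return self.seq


def recover(store, up_to_seq=None):
    """Replay logged objects in order; stop at up_to_seq for point-in-time recovery."""
    blocks = {}
    for key in sorted(store.objects):
        seq = int(key.split("/")[1])
        if up_to_seq is not None and seq > up_to_seq:
            break
        block, data = store.objects[key]
        blocks[block] = data
    return blocks


store = ObjectStore()
disk = ReplicatingDisk(store)
disk.write(0, b"a")
checkpoint = disk.write(1, b"b")
disk.write(0, b"c")

full_state = recover(store)                        # full recovery
earlier_state = recover(store, up_to_seq=checkpoint)  # point-in-time recovery
```

Because every write becomes its own sequenced object, the same log serves both recovery modes: replaying everything gives the latest state, and stopping at a chosen sequence number gives an earlier one.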

Virtualized file server smart data ingestion

In one embodiment, a system for managing a virtualization environment includes a set of host machines, each of which includes a hypervisor, virtual machines, and a virtual machine controller, and a data migration system configured to identify one or more existing storage items stored at one or more existing File Server Virtual Machines (FSVMs) of an existing virtualized file server (VFS). For each of the existing storage items, the data migration system is configured to identify a new FSVM of a new VFS based on the existing FSVM, send a representation of the storage item from the existing FSVM to the new FSVM, such that representations of storage items are sent between different pairs of FSVMs in parallel, and store a new storage item at the new FSVM, such that the new storage item is based on the representation of the existing storage item received by the new FSVM.
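The pairwise, parallel transfer described above might be sketched as follows. The FSVM names, the one-to-one `pairing` table, and the use of a thread pool are assumptions made for illustration only; the abstract does not specify how pairs are chosen or how the parallelism is implemented.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: existing FSVM name -> storage items it holds.
existing_vfs = {
    "old-fsvm-1": {"share/a": b"alpha"},
    "old-fsvm-2": {"share/b": b"beta", "share/c": b"gamma"},
}
# Assumed mapping: each existing FSVM is paired with exactly one new FSVM.
pairing = {"old-fsvm-1": "new-fsvm-1", "old-fsvm-2": "new-fsvm-2"}
new_vfs = {name: {} for name in pairing.values()}


def migrate_pair(old_name):
    """Send each item's representation from one existing FSVM to its paired new FSVM."""
    target = new_vfs[pairing[old_name]]
    for path, data in existing_vfs[old_name].items():
        representation = bytes(data)   # stand-in for a serialized representation
        target[path] = representation  # new item based on the representation
    return old_name, len(existing_vfs[old_name])


# One worker per FSVM pair, so the pairs transfer in parallel.
with ThreadPoolExecutor(max_workers=len(pairing)) as pool:
    migrated = dict(pool.map(migrate_pair, existing_vfs))
```

Each worker writes only to its own target FSVM, so the pairs do not contend with one another in this sketch.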

Dynamic allocation of compute resources at a recovery site

Examples of systems are described herein which may dynamically allocate compute resources to recovery clusters. Accordingly, a recovery site may utilize fewer compute resources in maintaining recovery clusters for multiple associated clusters, while ensuring that, during use, compute resources are allocated to a particular cluster. This may reduce and/or avoid vulnerabilities arising from a use of shared resources in a virtualized and/or cloud environment.
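One way to picture the allocation scheme is a pool of spare compute nodes that are dedicated to a single recovery cluster only while that cluster is in use. The sketch below is a simplified assumption: `RecoverySite`, `failover`, and `failback` are hypothetical names, and the abstract does not describe when or how nodes are returned.

```python
class RecoverySite:
    """Hypothetical recovery site holding a pool of spare compute nodes."""

    def __init__(self, spare_nodes):
        self.spare = list(spare_nodes)
        self.allocated = {}  # recovery cluster -> nodes dedicated to it

    def failover(self, cluster, nodes_needed):
        """Dedicate spare nodes to one recovery cluster for the duration of its use."""
        if nodes_needed > len(self.spare):
            raise RuntimeError("not enough spare compute nodes")
        taken = self.spare[:nodes_needed]
        self.spare = self.spare[nodes_needed:]
        self.allocated.setdefault(cluster, []).extend(taken)
        return taken

    def failback(self, cluster):
        """Return a cluster's nodes to the spare pool once it no longer needs them."""
        self.spare.extend(self.allocated.pop(cluster, []))


site = RecoverySite(["n1", "n2", "n3"])
nodes_a = site.failover("recovery-A", 2)
nodes_b = site.failover("recovery-B", 1)
```

A node is never in two clusters' allocations at once, which is the property the abstract relies on to avoid sharing resources between clusters during use.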

Virtualized file server

In one embodiment, a system for managing communication connections in a virtualization environment includes a plurality of host machines implementing a virtualization environment, wherein each of the host machines includes a hypervisor, at least one user virtual machine (user VM), and a distributed file server that includes file server virtual machines (FSVMs) and associated local storage devices. Each FSVM and associated local storage device are local to a corresponding one of the host machines, and the FSVMs conduct I/O transactions with their associated local storage devices based on I/O requests received from the user VMs. Each of the user VMs on each host machine sends each of its respective I/O requests to an FSVM that is selected by one or more of the FSVMs for each I/O request based on a lookup table that maps a storage item referenced by the I/O request to the selected one of the FSVMs.
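The lookup-table routing in this abstract can be sketched as a mapping from storage items to FSVMs. The table contents and the rule of walking up to the nearest mapped parent directory are assumptions made for illustration; the abstract only states that the table maps a storage item to the selected FSVM.

```python
import posixpath

# Hypothetical lookup table: storage item (directory) -> owning FSVM.
LOOKUP = {"/home/alice": "fsvm-a", "/home/bob": "fsvm-b", "/": "fsvm-a"}


def select_fsvm(path):
    """Return the FSVM for a path by walking up to the nearest mapped directory."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute")
    current = posixpath.normpath(path)
    while True:
        if current in LOOKUP:
            return LOOKUP[current]
        if current == "/":
            raise KeyError(path)  # only reachable if "/" is not mapped
        current = posixpath.dirname(current)


owner = select_fsvm("/home/bob/notes.txt")  # "fsvm-b"
```

A user VM would send the I/O request for `/home/bob/notes.txt` to `fsvm-b`, which then performs the transaction against its local storage.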

Front end traffic handling in modular switched fabric based data storage systems

Systems, methods, apparatuses, and software for data storage systems are provided herein. In one example, a data storage system is provided that includes storage drives each comprising a PCIe interface, and configured to store data and retrieve the data stored on associated storage media responsive to data transactions received over a switched PCIe fabric. The data storage system includes processors configured to each manage only an associated subset of the storage drives over the switched PCIe fabric. A first processor is configured to identify first data packets received over a network interface associated with the first processor within a network buffer of the first processor as comprising a storage operation associated with at least one of the plurality of storage drives managed by a second processor, and responsively transfer the first data packets into a network buffer of the second processor.
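The buffer-to-buffer handoff described above might look like the following sketch. `Processor`, `dispatch`, and the dictionary-shaped packets are hypothetical; real packet inspection and the PCIe fabric transfer are replaced here by simple queue operations.

```python
from collections import deque


class Processor:
    """Hypothetical processor that manages only its own subset of storage drives."""

    def __init__(self, name, drives):
        self.name = name
        self.drives = set(drives)
        self.network_buffer = deque()
        self.handled = []


def dispatch(receiver, processors):
    """Drain the receiver's buffer: handle packets for its drives, forward the rest."""
    while receiver.network_buffer:
        packet = receiver.network_buffer.popleft()
        owner = next(p for p in processors if packet["drive"] in p.drives)
        if owner is receiver:
            receiver.handled.append(packet)
        else:
            # Transfer into the network buffer of the processor managing the drive.
            owner.network_buffer.append(packet)


p1 = Processor("p1", ["d0", "d1"])
p2 = Processor("p2", ["d2", "d3"])
p1.network_buffer.extend([
    {"drive": "d0", "op": "read"},
    {"drive": "d3", "op": "write"},
])
dispatch(p1, [p1, p2])
```

After the call, `p1` has handled the `d0` read itself, and the `d3` write is waiting in `p2`'s network buffer.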

Dynamic allocation of compute resources at a recovery site

Examples of systems are described herein which may dynamically allocate compute resources to recovery clusters. Accordingly, a recovery site may utilize fewer compute resources in maintaining recovery clusters for multiple associated clusters, while ensuring that, during use, compute resources are allocated to a particular cluster. This may reduce and/or avoid vulnerabilities arising from a use of shared resources in a virtualized and/or cloud environment.

Access consistency in high-availability databases
11704182 · 2023-07-18

Techniques are disclosed relating to maintaining a high availability (HA) database. In some embodiments, a computer system receives, from a plurality of host computers, a plurality of requests to access data stored in a database implemented using a plurality of clusters. In some embodiments, the computer system responds to the plurality of requests by accessing data stored in an active cluster. The computer system may then determine, based on the responding, health information for ones of the plurality of clusters, wherein the health information is generated based on real-time traffic for the database. In some embodiments, the computer system determines, based on the health information, whether to switch from accessing the active cluster to accessing a backup cluster. In some embodiments, the computer system stores, in respective clusters of the database, a changeover decision generated based on the determining.
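The traffic-driven changeover can be sketched as follows. The sliding window of recent outcomes, the failure threshold, and the names `HARouter`, `handle`, and `decisions` are all assumptions; the abstract does not say how health information is computed or what form the stored changeover decision takes.

```python
from collections import deque


class HARouter:
    """Hypothetical router that derives cluster health from live request outcomes."""

    def __init__(self, clusters, active, backup, window=5, max_failures=3):
        self.clusters = clusters  # name -> callable(request) that may raise
        self.active, self.backup = active, backup
        self.outcomes = deque(maxlen=window)  # recent successes and failures
        self.max_failures = max_failures
        self.decisions = {name: None for name in clusters}  # copy kept per cluster

    def handle(self, request):
        try:
            result = self.clusters[self.active](request)
            self.outcomes.append(True)
        except ConnectionError:
            result = None
            self.outcomes.append(False)
        # Health check based only on real traffic seen in the window.
        if self.outcomes.count(False) >= self.max_failures:
            self._changeover()
        return result

    def _changeover(self):
        self.active, self.backup = self.backup, self.active
        self.outcomes.clear()
        # Record the changeover decision in every cluster.
        for name in self.decisions:
            self.decisions[name] = f"active={self.active}"
```

With the default settings, three failed requests inside the five-request window switch the router to the backup cluster, and every cluster's stored decision then names the new active cluster.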

Migration of virtual compute instances using remote direct memory access

A virtual compute instance is migrated between hosts using remote direct memory access (RDMA). The hosts are equipped with RDMA-enabled network interface controllers for carrying out RDMA operations between them. Upon failure of a first host and copying of page tables of the virtual compute instance to the first host's memory, a first RDMA operation is performed to transfer the page tables from the first host's memory to the second host's memory. Then, second RDMA operations are performed to transfer data pages of the virtual compute instance from the first host's memory to the second host's memory, with references to memory locations of the data pages specified in the page tables. The page tables of the virtual compute instance are reconstructed to reference memory locations of the data pages in the second host's memory and stored therein.
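The two-phase transfer (page tables first, then the data pages they reference) can be illustrated with a pure simulation. No RDMA library is used: `rdma_read` is a hypothetical stand-in for a one-sided read, host memory is modeled as a dictionary keyed by address, and the `"page_table"` key is an arbitrary choice for where the rebuilt tables are stored.

```python
def rdma_read(src_memory, address):
    """Stand-in for a one-sided RDMA read; not a real RDMA API."""
    return src_memory[address]


def migrate(first_host_mem, page_table_addr, second_host_mem, free_addrs):
    """Copy page tables, then data pages, then rebuild the tables for the new host."""
    # First operation: transfer the page tables themselves.
    page_table = dict(rdma_read(first_host_mem, page_table_addr))
    # Second operations: transfer each data page referenced by the page tables.
    rebuilt = {}
    for vpn, src_addr in page_table.items():
        dst_addr = free_addrs.pop(0)
        second_host_mem[dst_addr] = rdma_read(first_host_mem, src_addr)
        rebuilt[vpn] = dst_addr  # now references the second host's memory
    second_host_mem["page_table"] = rebuilt
    return rebuilt


first = {0x10: b"A", 0x20: b"B", "pt": {0: 0x10, 1: 0x20}}
second = {}
rebuilt = migrate(first, "pt", second, [0x100, 0x200])
```

The rebuilt table maps the same virtual page numbers to addresses in the second host's memory, matching the reconstruction step in the abstract.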

Automated discovery of databases

A networked computing system comprises a backup node cluster of a backup service in communication with a host database node cluster of a host, a host database at least initially undiscovered by the backup node cluster, one or more processors coupled with memory storing instructions that, when executed, perform operations comprising at least installing a backup agent on at least one node of the host database node cluster, registering the host at the backup service, based on the host registration, triggering a host database discovery process to discover the undiscovered database automatically, the discovery process including a discovery call, in response to the discovery call, receiving metadata relating to the discovered database, and communicating with the discovered database.
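The sequence of operations in this abstract (install agent, register host, trigger discovery, receive metadata) might be sketched as below. `Host`, `BackupService`, `register_host`, and `discovery_call` are hypothetical names, and the metadata fields are invented sample values rather than anything from the source.

```python
class Host:
    """Hypothetical database host whose databases start out undiscovered."""

    def __init__(self, name, databases):
        self.name = name
        self._databases = databases  # db name -> metadata, unknown to the backup service
        self.agent_installed = False

    def discovery_call(self):
        if not self.agent_installed:
            raise RuntimeError("backup agent not installed")
        return dict(self._databases)


class BackupService:
    """Hypothetical backup service that discovers databases on registration."""

    def __init__(self):
        self.hosts = {}
        self.discovered = {}

    def register_host(self, host):
        host.agent_installed = True   # install the backup agent on the host
        self.hosts[host.name] = host  # register the host at the backup service
        self.discover(host.name)      # registration triggers discovery

    def discover(self, host_name):
        metadata = self.hosts[host_name].discovery_call()
        for db_name, meta in metadata.items():
            self.discovered[(host_name, db_name)] = meta


host = Host("db-host", {"orders": {"engine": "example", "size_gb": 12}})
service = BackupService()
service.register_host(host)
```

No operator step names the database: it becomes known to the service only through the metadata returned by the discovery call.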

Adaptive multipath fabric for balanced performance and high availability

A computing system providing high-availability access to computing resources includes: a plurality of interfaces; a plurality of sets of computing resources, each of the sets of computing resources including a plurality of computing resources; and at least three switches, each of the switches being connected to a corresponding one of the interfaces via a host link and being connected to a corresponding one of the sets of computing resources via a plurality of resource connections, each of the switches being configured such that data traffic is distributed to remaining ones of the switches through a plurality of cross-connections between the switches if one of the switches fails.
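The redistribution of traffic after a switch failure can be illustrated numerically. The even split of a failed switch's load across the remaining switches is an assumption for this sketch; the abstract says only that traffic is distributed to the remaining switches over the cross-connections.

```python
def distribute(traffic, failed):
    """Return per-switch load after spreading failed switches' traffic evenly.

    traffic: mapping of switch name -> current load
    failed:  set of switch names that have failed
    """
    healthy = [s for s in traffic if s not in failed]
    if not healthy:
        raise RuntimeError("no healthy switches remain")
    result = {s: traffic[s] for s in healthy}
    for s in failed:
        share = traffic[s] / len(healthy)  # assumed even split over cross-connections
        for h in healthy:
            result[h] += share
    return result


after_failure = distribute({"sw1": 30, "sw2": 30, "sw3": 30}, {"sw2"})
# after_failure == {"sw1": 45.0, "sw3": 45.0}
```

With at least three switches, a single failure leaves two or more healthy switches to share the extra load, rather than shifting all of it onto one peer.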