G06F11/2035

Virtualized file server smart data ingestion

In one embodiment, a system for managing a virtualization environment includes a set of host machines, each of which includes a hypervisor, virtual machines, and a virtual machine controller, and a data migration system configured to identify one or more existing storage items stored at one or more existing File Server Virtual Machines (FSVMs) of an existing virtualized file server (VFS). For each of the existing storage items, the data migration system is configured to identify a new FSVM of a new VFS based on the existing FSVM, send a representation of the storage item from the existing FSVM to the new FSVM, such that representations of storage items are sent between different pairs of FSVMs in parallel, and store a new storage item at the new FSVM, such that the new storage item is based on the representation of the existing storage item received by the new FSVM.
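As a rough illustration of the pairwise parallel transfer described above, the following Python sketch copies storage items between mapped FSVM pairs using one worker per pair. The in-memory dictionaries, the FSVM names, and the one-to-one `fsvm_map` are assumptions made for the example; the abstract does not specify how FSVMs are paired or how a representation is encoded.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for FSVM storage; all names are illustrative.
existing_vfs = {
    "old-fsvm-1": {"share-a": b"alpha", "share-b": b"beta"},
    "old-fsvm-2": {"share-c": b"gamma"},
}
# Assumed one-to-one mapping from each existing FSVM to a new FSVM.
fsvm_map = {"old-fsvm-1": "new-fsvm-1", "old-fsvm-2": "new-fsvm-2"}
new_vfs = {name: {} for name in fsvm_map.values()}


def migrate_pair(old_name):
    """Copy every storage item from one existing FSVM to its mapped new FSVM."""
    new_name = fsvm_map[old_name]
    for item, data in existing_vfs[old_name].items():
        representation = bytes(data)  # stand-in for a serialized representation
        new_vfs[new_name][item] = representation
    return new_name


def migrate_all():
    # Each (existing, new) pair gets its own worker, so pairs transfer in parallel.
    with ThreadPoolExecutor(max_workers=len(fsvm_map)) as pool:
        return sorted(pool.map(migrate_pair, existing_vfs))
```

Because each worker writes only to its own destination FSVM, the pairs do not contend with one another in this sketch.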

Cross cluster replication
11580133 · 2023-02-14

Methods and systems for cross cluster replication are provided. Exemplary methods include: periodically requesting, by a follower cluster, history from a leader cluster, the history including at least one operation and sequence number pair, the operation having changed data in a primary shard of the leader cluster; receiving history and a first global checkpoint from the leader cluster; when a difference between the first global checkpoint and a second global checkpoint exceeds a user-defined value, concurrently making multiple additional requests for history from the leader cluster; and when a difference between the first global checkpoint and the second global checkpoint is less than a user-defined value, executing the at least one operation, the at least one operation changing data in a primary shard of the follower cluster, such that an index of the follower cluster replicates an index of the leader cluster.
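The polling behavior can be sketched in Python as follows. This is a simplified model under several assumptions: the `Leader` and `Follower` classes, the `max_lag` and `batch` parameters, and the in-memory log are hypothetical, the second global checkpoint is taken to be the follower's own, and the follower applies fetched operations in both branches once it has caught up, which the abstract does not state.

```python
from concurrent.futures import ThreadPoolExecutor


class Leader:
    """Hypothetical leader shard: an ordered log of (seq_no, operation) pairs."""

    def __init__(self, ops):
        self.log = list(enumerate(ops))  # sequence numbers start at 0
        self.global_checkpoint = len(ops) - 1

    def history(self, from_seq, size):
        return self.log[from_seq:from_seq + size], self.global_checkpoint


class Follower:
    def __init__(self, max_lag=4, batch=2):
        self.data = {}
        self.global_checkpoint = -1  # nothing applied yet
        self.max_lag = max_lag       # the user-defined value from the abstract
        self.batch = batch

    def apply(self, history):
        for seq_no, (key, value) in sorted(history):
            if seq_no == self.global_checkpoint + 1:  # apply strictly in order
                self.data[key] = value
                self.global_checkpoint = seq_no

    def poll(self, leader):
        start = self.global_checkpoint + 1
        history, leader_checkpoint = leader.history(start, self.batch)
        if leader_checkpoint - self.global_checkpoint > self.max_lag:
            # Far behind: request the remaining ranges concurrently.
            starts = range(start + self.batch, leader_checkpoint + 1, self.batch)
            with ThreadPoolExecutor() as pool:
                fetch = lambda s: leader.history(s, self.batch)[0]
                for chunk in pool.map(fetch, starts):
                    history += chunk
        self.apply(history)
```

Calling `poll` repeatedly on a timer would correspond to the periodic requests in the abstract.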

Methods and systems for rapid failure recovery for a distributed storage system
11579992 · 2023-02-14

Methods and systems are provided for rapid failure recovery for a distributed storage system for failures by one or more nodes.

Predicting and managing requests for computing resources or other resources

Requests for computing resources and other resources can be predicted and managed. For example, a system can determine a baseline prediction indicating a number of requests for an object over a future time-period. The system can then execute a first model to generate a first set of values based on seasonality in the baseline prediction, a second model to generate a second set of values based on short-term trends in the baseline prediction, and a third model to generate a third set of values based on the baseline prediction. The system can select a most accurate model from among the three models and generate an output prediction by applying the set of values output by the most accurate model to the baseline prediction. Based on the output prediction, the system can cause an adjustment to be made to a provisioning process for the object.
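One way to picture the model-selection step is the Python sketch below. The three toy models, the mean-absolute-error scoring against known actuals, and the additive way the values are applied to the baseline are all assumptions for illustration; the abstract does not say how accuracy is measured or how the values are combined with the baseline.

```python
def seasonal(baseline):
    # Hypothetical seasonality values: an alternating pattern of period 2.
    return [1.0 if i % 2 == 0 else -1.0 for i in range(len(baseline))]


def short_term(baseline):
    # Hypothetical short-term trend: carry the most recent step forward.
    step = baseline[-1] - baseline[-2]
    return [float(step)] * len(baseline)


def unadjusted(baseline):
    # Third model: values that leave the baseline as it is.
    return [0.0] * len(baseline)


def mean_abs_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def select_and_predict(baseline, actuals, models):
    """Score each model on a window with known actuals and apply the best one."""
    scored = {}
    for name, model in models.items():
        adjusted = [b + v for b, v in zip(baseline, model(baseline))]
        scored[name] = (mean_abs_error(adjusted, actuals), adjusted)
    best = min(scored, key=lambda name: scored[name][0])
    return best, scored[best][1]


MODELS = {"seasonal": seasonal, "short_term": short_term, "unadjusted": unadjusted}
```

The returned prediction could then drive a provisioning adjustment, for example scaling capacity ahead of a predicted peak.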

Virtualized file server

In one embodiment, a system for managing communication connections in a virtualization environment includes a plurality of host machines implementing a virtualization environment, wherein each of the host machines includes a hypervisor, at least one user virtual machine (user VM), and a distributed file server that includes file server virtual machines (FSVMs) and associated local storage devices. Each FSVM and associated local storage device are local to a corresponding one of the host machines, and the FSVMs conduct I/O transactions with their associated local storage devices based on I/O requests received from the user VMs. Each of the user VMs on each host machine sends each of its respective I/O requests to an FSVM that is selected by one or more of the FSVMs for each I/O request based on a lookup table that maps a storage item referenced by the I/O request to the selected one of the FSVMs.
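The lookup-table routing can be illustrated with a short Python sketch. The table contents, the FSVM names, and the longest-prefix matching rule are assumptions; the abstract only states that a lookup table maps the storage item referenced by an I/O request to the selected FSVM.

```python
# Hypothetical lookup table: storage item (share or folder path) -> owning FSVM.
lookup_table = {
    "/share1": "fsvm-a",
    "/share1/projects": "fsvm-b",
    "/share2": "fsvm-c",
}


def select_fsvm(path):
    """Return the FSVM for the longest table entry that is a prefix of the path."""
    candidate = path
    while candidate:
        if candidate in lookup_table:
            return lookup_table[candidate]
        # Drop the last path component and try the parent folder.
        candidate = candidate.rsplit("/", 1)[0]
    raise KeyError(f"no FSVM owns {path!r}")
```

A user VM would send each I/O request to the FSVM returned by `select_fsvm` for the path named in that request.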

Systems and methods for managing a highly available distributed hybrid transactional and analytical database

Systems and methods for managing a highly available distributed hybrid database comprising: a memory storing instructions; and one or more processors configured to execute the instructions to: receive a query from a user device to retrieve data from a distributed database comprising a source node, a first plurality of replica nodes, and a second plurality of replica nodes, wherein the source node and the first plurality of replica nodes form a transactional cluster, and wherein the second plurality of replica nodes forms an analytical cluster; determine whether to process the query using the transactional cluster or the analytical cluster based on one or more rules; translate the query into a first protocol that the determined cluster comprehends; select a replica node corresponding to the determined cluster; process the query using the selected replica node; and send data associated with results from processing the query to the user device.
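A minimal Python sketch of the routing decision follows. The rule (aggregate queries go to the analytical cluster), the replica names, and the round-robin replica selection are hypothetical choices for the example, and the protocol translation step is omitted; the abstract leaves the rules and the selection policy open.

```python
import itertools
import re

# Hypothetical replica pools, each cycled round-robin.
clusters = {
    "transactional": itertools.cycle(["tx-replica-1", "tx-replica-2"]),
    "analytical": itertools.cycle(["an-replica-1", "an-replica-2"]),
}

# Assumed rule: queries that aggregate are treated as analytical.
AGGREGATE = re.compile(r"\b(GROUP BY|SUM|AVG|COUNT)\b", re.IGNORECASE)


def choose_cluster(query):
    return "analytical" if AGGREGATE.search(query) else "transactional"


def route(query):
    """Pick a cluster by rule, then pick the next replica in that cluster."""
    cluster = choose_cluster(query)
    replica = next(clusters[cluster])
    return cluster, replica
```

In a fuller system the chosen replica would then execute the translated query and return the results to the user device.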

Database system

The present disclosure relates to a method of operating a database system. The database system comprises: a database; a first compute node comprising a first database proxy; and a second compute node comprising a second database proxy. The method comprises receiving and processing, at the first database proxy, a first plurality of access requests to access the database; receiving and processing, at the second database proxy, a second plurality of database access requests to access the database; monitoring for a failure event associated with the first database proxy; and, in response to the monitoring indicating a failure event, initiating a failover procedure between the first database proxy and the second database proxy. The failover procedure comprises: redirecting the first plurality of access requests to the second database proxy; and processing, at the second database proxy, the first plurality of access requests.
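The failover procedure might look like the following Python sketch. The `DatabaseProxy` and `Router` classes are hypothetical, and failure monitoring is reduced to catching a connection error on the request path, which is only one of many ways a failure event could be detected.

```python
class DatabaseProxy:
    """Hypothetical proxy that records the access requests it processes."""

    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.processed = []

    def process(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        self.processed.append(request)
        return f"{self.name}:{request}"


class Router:
    def __init__(self, first, second):
        # Each request stream starts on its own proxy.
        self.routes = {"first": first, "second": second}
        self.second = second

    def send(self, stream, request):
        try:
            return self.routes[stream].process(request)
        except ConnectionError:
            # Failure event: redirect this stream to the second proxy
            # and process the request there.
            self.routes[stream] = self.second
            return self.second.process(request)
```

After the redirect, later requests on the first stream go straight to the second proxy without retrying the failed one.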

Dynamic, distributed, and scalable single endpoint solution for a service in cloud platform
11706162 · 2023-07-18

A first forwarding VM may execute in a first availability zone and have a first IP address. Similarly, a second forwarding VM may execute in a second availability zone and have a second IP address. The first and second IP addresses may be recorded with a cloud DNS web service of a cloud provider such that both receive requests from applications directed to a particular DNS name acting as a single endpoint. A service cluster may include a master VM node and a standby VM node. An IPtable in each forwarding VM may forward a request having a port value to a cluster port value associated with the master VM node. Upon a failure of the master VM node, the current standby VM node may be promoted to execute in master mode and the IPtables may be updated to now forward requests having the port value to a cluster port value associated with the newly promoted master VM node (which was previously the standby VM node).
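The table update on failover can be sketched in Python as below. The classes, IP addresses, and port numbers are made up for the example, and a plain dictionary stands in for the IPtable in each forwarding VM; no real iptables commands are issued.

```python
class ForwardingVM:
    """Hypothetical forwarding VM holding an iptables-like port mapping."""

    def __init__(self, ip_address):
        self.ip_address = ip_address
        self.table = {}  # request port -> cluster port of the current master

    def forward(self, port):
        return self.table[port]


class ServiceCluster:
    def __init__(self, service_port, master_port, standby_port, forwarders):
        self.service_port = service_port
        self.master_port = master_port
        self.standby_port = standby_port
        self.forwarders = forwarders
        self._update_tables()

    def _update_tables(self):
        # Point every forwarding VM at the node currently running in master mode.
        for vm in self.forwarders:
            vm.table[self.service_port] = self.master_port

    def fail_master(self):
        # Promote the standby, then rewrite the forwarding tables.
        if self.standby_port is None:
            raise RuntimeError("no standby node available to promote")
        self.master_port, self.standby_port = self.standby_port, None
        self._update_tables()
```

Applications keep using the same DNS name and port throughout; only the mapping inside the forwarding VMs changes.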

Migration of virtual compute instances using remote direct memory access

A virtual compute instance is migrated between hosts using remote direct memory access (RDMA). The hosts are equipped with RDMA-enabled network interface controllers for carrying out RDMA operations between them. Upon failure of a first host and copying of page tables of the virtual compute instance to the first host's memory, a first RDMA operation is performed to transfer the page tables from the first host's memory to the second host's memory. Then, second RDMA operations are performed to transfer data pages of the virtual compute instance from the first host's memory to the second host's memory, with references to memory locations of the data pages specified in the page tables. The page tables of the virtual compute instance are reconstructed to reference memory locations of the data pages in the second host's memory and stored therein.
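The two-phase transfer and the page table reconstruction can be modeled in Python as follows. Dictionaries stand in for host memory and plain copies stand in for RDMA operations; the addresses, the `PAGE_SIZE` value, and the `dest_base` parameter are assumptions for the example.

```python
PAGE_SIZE = 0x1000  # assumed page size for the example


def migrate(source_memory, page_table, dest_memory, dest_base=0x1000):
    """Copy page tables, then data pages, and rebuild the table for the new host.

    source_memory / dest_memory: address -> page contents (stand-ins for RAM).
    page_table: virtual page number -> address in the first host's memory.
    """
    # First transfer: the page tables themselves.
    transferred_table = dict(page_table)

    rebuilt = {}
    next_addr = dest_base
    # Second transfers: each data page referenced by the page tables.
    for vpn, src_addr in sorted(transferred_table.items()):
        dest_memory[next_addr] = source_memory[src_addr]
        # Reconstruct the entry to reference the second host's memory.
        rebuilt[vpn] = next_addr
        next_addr += PAGE_SIZE
    return rebuilt
```

The returned table maps the same virtual page numbers to locations in the second host's memory, so the instance can resume there.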

Cluster wide rebuild reduction against storage node failures
20230013798 · 2023-01-19

Systems, apparatuses and methods may provide for technology that detects a first failure in a first storage server, wherein the first storage server is connected to a first non-volatile memory (NVM) via a switch, selects a second storage server that is connected to the first NVM via the switch, wherein the first storage server and the second storage server are in a storage cluster, and configures the second storage server to host first data resident on the first NVM, wherein configuring the second storage server to host the first data bypasses a cluster-wide rebalance of the storage cluster.
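A small Python sketch of the takeover logic is shown below. The switch topology, server names, and the first-candidate selection policy are hypothetical; the point of the example is that only ownership changes, so no data is copied and no cluster-wide rebalance is triggered.

```python
# Hypothetical topology: which storage servers can reach each NVM via the switch.
switch_links = {
    "nvm-1": ["server-a", "server-b"],
    "nvm-2": ["server-b", "server-c"],
}
# Which server currently hosts the data resident on each NVM.
owner = {"nvm-1": "server-a", "nvm-2": "server-c"}


def handle_failure(failed, healthy):
    """Reassign every NVM hosted by the failed server to a healthy peer."""
    moved = {}
    for nvm, host in list(owner.items()):
        if host != failed:
            continue
        candidates = [
            server for server in switch_links[nvm]
            if server in healthy and server != failed
        ]
        if not candidates:
            raise RuntimeError(f"no healthy server is connected to {nvm}")
        # Only the ownership record changes; the data stays on the NVM.
        owner[nvm] = candidates[0]
        moved[nvm] = candidates[0]
    return moved
```

NVMs hosted by servers that did not fail are left untouched, which is what keeps the recovery local rather than cluster-wide.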