G06F2209/505

RECONFIGURABLE COMPUTING PODS USING OPTICAL NETWORKS
20230161638 · 2023-05-25

Methods, systems, and apparatus, including an apparatus for generating clusters of building blocks of compute nodes using an optical network, are described. In one aspect, a method includes receiving request data specifying requested compute nodes for a computing workload. The request data specifies a target n-dimensional arrangement of the compute nodes. From a superpod that includes a set of building blocks, each of which includes an m-dimensional arrangement of compute nodes, a subset of the building blocks is selected that, when combined, match the target n-dimensional arrangement specified by the request data. The set of building blocks is connected to an optical network that includes one or more optical circuit switches. A workload cluster of compute nodes that includes the subset of the building blocks is generated. The generating includes configuring, for each dimension of the workload cluster, respective routing data for the one or more optical circuit switches.
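
The selection step in this abstract can be illustrated with a minimal sketch. It assumes that the target arrangement and the building blocks have the same number of dimensions and that each target dimension is a whole multiple of the block dimension; the abstract does not state either condition. The function names and the list of block identifiers are hypothetical.

```python
from math import prod


def blocks_needed(target_shape, block_shape):
    """Return how many building blocks are needed along each dimension.

    Assumption (not from the abstract): both shapes have the same rank and
    every target dimension is a multiple of the block dimension.
    """
    if len(target_shape) != len(block_shape):
        raise ValueError("target and block must have the same number of dimensions")
    counts = []
    for target_dim, block_dim in zip(target_shape, block_shape):
        if target_dim % block_dim != 0:
            raise ValueError(f"{target_dim} is not a multiple of block size {block_dim}")
        counts.append(target_dim // block_dim)
    return tuple(counts)


def select_blocks(available_ids, target_shape, block_shape):
    """Pick the first blocks from a hypothetical list of available block ids."""
    counts = blocks_needed(target_shape, block_shape)
    total = prod(counts)
    if total > len(available_ids):
        raise RuntimeError("not enough available building blocks")
    return counts, list(available_ids[:total])


# An 8x4x4 request built from 4x4x4 blocks needs two blocks along the first dimension.
counts, chosen = select_blocks(list(range(10)), (8, 4, 4), (4, 4, 4))
```

The per-dimension counts are the part a real system would translate into routing data for the optical circuit switches; that translation is not shown here.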

PERFORMANCE TUNING IN A NETWORK SYSTEM

A container-based orchestration system includes a master node and a plurality of worker nodes. The master node can receive, from each agent executing on a corresponding worker node, node characteristics associated with the worker node. The master node can determine, for each worker node, one or more parameters corresponding to the node characteristics associated with the corresponding worker node and a node profile of the worker node and provide the parameters to the agent executing on the corresponding worker node. The agent configures the worker node in accordance with the parameters. In response to receiving a request to deploy a pod to a worker node, the master node can select a worker node to receive the pod based on the node characteristics and the pod characteristics. The agent can configure the selected worker node to execute workloads of the pod in accordance with the one or more parameters.
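
As a rough illustration of the two decisions described above (deriving parameters from node characteristics plus a node profile, and choosing a worker node for a pod), the sketch below uses hypothetical names throughout. The profile table, the parameter keys, and the tie-breaking rule are assumptions, not details from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    profile: str      # hypothetical profile label reported for the node
    cpus: int         # node characteristic reported by the agent
    memory_gib: int   # node characteristic reported by the agent


# Hypothetical profile-specific tuning parameters.
PROFILE_PARAMS = {
    "latency": {"cpu_governor": "performance"},
    "throughput": {"cpu_governor": "ondemand"},
}


def parameters_for(node):
    """Combine profile defaults with a value derived from node characteristics."""
    params = dict(PROFILE_PARAMS.get(node.profile, {}))
    params["worker_threads"] = node.cpus
    return params


def select_node(nodes, pod_cpus, pod_memory_gib, wanted_profile):
    """Return the smallest node that satisfies the pod, or None if none does."""
    candidates = [
        n for n in nodes
        if n.profile == wanted_profile
        and n.cpus >= pod_cpus
        and n.memory_gib >= pod_memory_gib
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda n: (n.cpus, n.memory_gib, n.name))


nodes = [
    Node("w1", "latency", 8, 32),
    Node("w2", "latency", 16, 64),
    Node("w3", "throughput", 32, 128),
]
chosen = select_node(nodes, pod_cpus=4, pod_memory_gib=16, wanted_profile="latency")
```

Choosing the smallest fitting node is only one possible policy; the abstract says the selection depends on node and pod characteristics without fixing a rule.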

VEHICLE AS A DISTRIBUTED COMPUTING RESOURCE
20230161623 · 2023-05-25

Distributed computing vehicles (e.g., using a computerized tool) are enabled. For example, a system can comprise a memory that stores computer executable components, and a processor that executes the computer executable components stored in the memory, wherein the computer executable components comprise: a request component that determines a compute request received via a network from a network device registered to use the system, and a resource component that, in response to a compute criterion associated with a vehicle communicatively coupled to the network being determined to be satisfied, allocates at least some compute resources of the vehicle to the compute request.

Avoidance of Workload Duplication Among Split-Clusters
20230161633 · 2023-05-25

A computer implemented method avoids workload duplication in a cluster environment. The computer identifies a state change among a set of cluster resources in a cluster of nodes. Responsive to identifying the state change, the computer predicts resource requirements for a queued workload. The computer determines a pre-assignment of the queued workload to a sub-cluster according to the resource requirements that were predicted for the queued workload. The computer marks the queued workload to indicate the pre-assignment to the sub-cluster.
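
The pre-assignment step might look like the sketch below. It assumes that a workload is a dictionary, that each sub-cluster advertises its free resources, and that a marked workload is run only by the sub-cluster named in the mark. All names, including the `predict` callback and the `pre_assigned_to` key, are hypothetical.

```python
def pre_assign(workload, sub_clusters, predict):
    """Mark a queued workload with the first sub-cluster that can satisfy it.

    `predict` is a hypothetical callback that returns predicted resource
    requirements, for example {"cpus": 4, "memory_gib": 8}.
    """
    needed = predict(workload)
    for name in sorted(sub_clusters):
        free = sub_clusters[name]
        if all(free.get(key, 0) >= amount for key, amount in needed.items()):
            workload["pre_assigned_to"] = name
            return name
    return None


def should_run(workload, my_sub_cluster):
    """After a split, only the marked sub-cluster picks up the workload."""
    return workload.get("pre_assigned_to") == my_sub_cluster


sub_clusters = {
    "left": {"cpus": 2, "memory_gib": 16},
    "right": {"cpus": 8, "memory_gib": 32},
}
job = {"id": "job-1", "rows": 1_000_000}
assigned = pre_assign(job, sub_clusters, lambda w: {"cpus": 4, "memory_gib": 8})
```

Because the mark travels with the queued workload, two sub-clusters that lose contact with each other can each check it locally instead of both starting the same job.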

Data processing pipeline error recovery

Techniques are disclosed for executing a data processing pipeline. The techniques may include receiving a job at a data pipeline queue, setting up one or more distributed processing environments, and allocating the job to one of the distributed processing environments. The techniques may further include receiving the allocated job at a job queue within the distributed processing environment, increasing a priority level of the job, and executing the job within the distributed processing environment. The techniques can further include providing a retry pipeline at the data processing pipeline, and re-executing the job at a stage following a failure of at least one of its components. The techniques may decrement a retry budget as the job is re-executed.
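
A retry budget of the kind mentioned above can be sketched as follows. This is a simplified, single-process illustration that assumes a job is a list of callable stages and that a failed stage is re-executed in place; the function name and the stage representation are hypothetical.

```python
def run_with_retry(stages, retry_budget):
    """Run stages in order, re-executing a failed stage while budget remains.

    Returns the unused retry budget. Completed stages are not run again.
    """
    index = 0
    while index < len(stages):
        try:
            stages[index]()
            index += 1
        except Exception:
            if retry_budget <= 0:
                raise  # budget exhausted: surface the failure
            retry_budget -= 1  # each re-execution consumes one unit of budget
    return retry_budget


calls = {"extract": 0, "load": 0}


def extract():
    calls["extract"] += 1


def load():
    calls["load"] += 1
    if calls["load"] < 3:
        raise RuntimeError("transient failure")


remaining = run_with_retry([extract, load], retry_budget=3)
```

Resuming at the failed stage rather than restarting the whole job keeps already completed work, which matches the abstract's description of re-executing the job at a stage following a failure.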

PROVISIONING OF PHYSICAL SERVERS THROUGH HARDWARE COMPOSITION
20230065444 · 2023-03-02

This disclosure describes techniques that include provisioning compute nodes within a data center out of available pools of hardware. In one example, this disclosure describes a method that includes monitoring, by a computing system, a first workload executing on a first compute node, wherein the first compute node includes processing circuitry and first node secondary storage; monitoring, by the computing system, a second workload executing on a second cluster of compute nodes; expanding, by the computing system, the second cluster of compute nodes to include a second compute node that includes second node secondary storage; redeploying the processing circuitry included within the first compute node to the second compute node; and enabling, by the computing system, the second workload to continue executing on the second cluster of compute nodes including the second compute node.

DYNAMICALLY PROVISIONING COMPUTING PODS IN A COMPUTING RESOURCE CLUSTER BASED ON A RESOURCE REQUEST FROM A STORAGE MANAGER OF AN INFORMATION MANAGEMENT SYSTEM

An information management system includes a storage manager for managing backup and/or restore operations for one or more client computing devices. The storage manager may be in communication with a resource administrator of a computing resource cluster, wherein the resource administrator instantiates one or more computing pods using the computing resource cluster. The resource administrator may receive a request for computing resources from the storage manager and provision the computing pods based on the request. The resource administrator may then select a pre-configured container image from one or more pre-configured container images based on the computing resource request, wherein the pre-configured container image configures a computing pod to create secondary copies of primary data from a particular primary data source of the information management system. The resource administrator may then communicate a message to the storage manager informing the storage manager of the availability of the provisioned computing pods.

DYNAMIC RELOCATION OF PODS TO OPTIMIZE INTER-POD NETWORKING
20220334886 · 2022-10-20

Systems and methods for dynamically relocating pods to optimize inter-pod networking efficiency are provided. The method comprises receiving and storing inter-pod traffic data for a plurality of pods. The plurality of pods includes a first pod, a second pod, and a third pod. The method further includes receiving and storing node resource availability data for each node of a plurality of nodes, generating a queue that sorts the plurality of pods by an amount of inter-pod traffic indicated by the inter-pod traffic data, generating a hash that maps one or more parameters to the plurality of nodes, selecting, based on the generated hash, a node of the plurality of nodes, and dynamically relocating a highest ranked pod of the plurality of pods from the generated queue to the selected node.
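
The queue and hash described in this abstract could be sketched as below. The abstract does not say which parameters the hash maps to nodes, so this example assumes a zone label as the key and free CPUs as the availability measure. The function names and the `peer_zone` and `cpu_needed` inputs are hypothetical.

```python
from collections import defaultdict


def rank_pods(traffic_bytes):
    """Queue of pod names, highest inter-pod traffic first (name breaks ties)."""
    return sorted(traffic_bytes, key=lambda pod: (-traffic_bytes[pod], pod))


def index_nodes_by_zone(nodes):
    """Hash map from a parameter (here: zone, an assumption) to node names."""
    index = defaultdict(list)
    for name in sorted(nodes):
        index[nodes[name]["zone"]].append(name)
    return index


def choose_relocation(traffic_bytes, nodes, peer_zone, cpu_needed):
    """Return (pod, node) for the highest ranked pod, or (pod, None) if no node fits."""
    top = rank_pods(traffic_bytes)[0]
    index = index_nodes_by_zone(nodes)
    for name in index.get(peer_zone[top], []):
        if nodes[name]["free_cpus"] >= cpu_needed[top]:
            return top, name
    return top, None


traffic = {"pod-a": 10, "pod-b": 50, "pod-c": 20}
nodes = {
    "n1": {"zone": "z1", "free_cpus": 1},
    "n2": {"zone": "z2", "free_cpus": 4},
    "n3": {"zone": "z2", "free_cpus": 8},
}
decision = choose_relocation(
    traffic,
    nodes,
    peer_zone={"pod-a": "z1", "pod-b": "z2", "pod-c": "z1"},
    cpu_needed={"pod-a": 1, "pod-b": 2, "pod-c": 1},
)
```

The sketch only returns the relocation decision; actually moving the pod and updating node availability are left out.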

Protection of private data using an enclave cluster
11470065 · 2022-10-11

Systems and methods are disclosed for protecting data. An example method includes creating an outer cluster on one or more host machines coupled to a network. The outer cluster includes a plurality of outer nodes. The method also includes creating an enclave cluster on the outer cluster. The enclave cluster includes a plurality of inner nodes, and each inner node of the plurality of inner nodes executes within an enclave of the one or more host machines. The method further includes exposing an application programming interface (API) to the outer cluster, where invocation of the API causes at least one inner node of the enclave cluster to perform an operation on data. The method also includes performing, by an inner node of the enclave cluster, the operation on the data in response to invocation of the API by an outer node of the outer cluster.

Container-as-a-service (CAAS) controller for selecting a bare-metal machine of a private cloud for a cluster of a managed container service

Embodiments described herein are generally directed to a controller of a managed container service that facilitates selection among bare-metal machines available within a private cloud. According to an example, a request is received by a Container-as-a-Service (CaaS) controller from a CaaS portal to create a cluster based at least in part on resources of a private cloud of a customer of a managed container service. An inventory of bare-metal machines available within the private cloud is received from a Bare-Metal-as-a-Service (BMaaS) provider associated with the private cloud. A particular bare-metal machine is identified for the cluster by selecting among the available bare-metal machines based on cluster information associated with the request, the inventory, and a best-fit algorithm configured in accordance with a policy established by the customer.
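
One way to read "best fit configured by policy" is sketched below: among machines that satisfy the request, pick the one with the least spare capacity, with the customer policy deciding which resource is compared first. The inventory format, the resource keys, and the `priority` parameter are assumptions made for illustration.

```python
def best_fit(inventory, required, priority=("cpus", "memory_gib")):
    """Return the machine that wastes the least capacity, or None if none fits.

    `priority` stands in for a customer policy (an assumption): leftover
    capacity is compared in this order. Every key in `priority` must also
    appear in `required`.
    """
    fits = [
        machine for machine in inventory
        if all(machine[key] >= amount for key, amount in required.items())
    ]
    if not fits:
        return None
    return min(
        fits,
        key=lambda m: tuple(m[key] - required[key] for key in priority) + (m["id"],),
    )


inventory = [
    {"id": "bm-1", "cpus": 64, "memory_gib": 128},
    {"id": "bm-2", "cpus": 16, "memory_gib": 512},
    {"id": "bm-3", "cpus": 32, "memory_gib": 128},
]
request = {"cpus": 16, "memory_gib": 64}

by_cpu = best_fit(inventory, request)
by_memory = best_fit(inventory, request, priority=("memory_gib", "cpus"))
```

With the same inventory, a CPU-first policy and a memory-first policy can pick different machines, which is why the policy is passed in rather than hard-coded.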