G06F2209/485

Scheduling workloads on a common set of resources by multiple schedulers operating independently

Workloads are scheduled on a common set of resources distributed across a cluster of hosts using at least two schedulers that operate independently. The resources include CPU, memory, network, and storage, and the workloads may be virtual objects, including VMs, and also operations including live migration of virtual objects, network file copy, reserving spare capacity for high availability restarts, and selecting hosts that are to go into maintenance mode. In addition, the at least two independent schedulers are assigned priorities such that the higher priority scheduler is executed to schedule workloads in its inventory on the common set of resources before the lower priority scheduler is executed to schedule workloads in its inventory on the common set of resources.
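As a rough illustration of the priority ordering described in this abstract, the following is a minimal sketch, not the patented method. The `Host`, `Workload`, and `Scheduler` classes, the first-fit placement rule, and the convention that a lower number means a higher priority are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    cpu: int  # free CPU units (hypothetical unit)
    mem: int  # free memory units (hypothetical unit)


@dataclass
class Workload:
    name: str
    cpu: int
    mem: int


@dataclass
class Scheduler:
    name: str
    priority: int  # assumption: lower number = higher priority
    inventory: list = field(default_factory=list)

    def schedule(self, hosts):
        """First-fit placement of this scheduler's own inventory."""
        placements = {}
        for w in self.inventory:
            for h in hosts:
                if h.cpu >= w.cpu and h.mem >= w.mem:
                    h.cpu -= w.cpu
                    h.mem -= w.mem
                    placements[w.name] = h.name
                    break
        return placements


def run_schedulers(schedulers, hosts):
    """Run each independent scheduler in priority order on the shared hosts."""
    results = {}
    for s in sorted(schedulers, key=lambda s: s.priority):
        results[s.name] = s.schedule(hosts)
    return results
```

Because every scheduler draws from the same `hosts` list, whatever the higher priority scheduler consumes is no longer available when the lower priority scheduler runs.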

Dynamic sequencing of data partitions for optimizing memory utilization and performance of neural networks

Optimized memory usage and management are crucial to the overall performance of a neural network (NN) or deep neural network (DNN) computing environment. Using various characteristics of the input data dimension, an apportionment sequence is calculated for the input data to be processed by the NN or DNN that optimizes the use of the local and external memory components. The apportionment sequence can describe how to parcel the input data (and its associated processing parameters, e.g., processing weights) into one or more portions as well as how such portions of input data (and its associated processing parameters) are passed between the local memory, external memory, and processing unit components of the NN or DNN. Additionally, the apportionment sequence can include instructions to store generated output data in the local and/or external memory components so as to optimize the use of those components.

Background task scheduling based on shared background bandwidth

Techniques for background task scheduling based on shared background bandwidth are described. A method for background task scheduling based on shared background bandwidth may include receiving a request to perform one or more background tasks on a storage server of a storage service in a provider network, determining a priority of each of the one or more background tasks, wherein each background task is associated with a size parameter and a temporal parameter, and wherein the priority of each of the one or more background tasks is based at least on its associated size parameter and temporal parameter, determining a task type associated with each background task, adding each background task to one of a plurality of task queues associated with different task types, wherein each task queue is associated with a bandwidth allocation, and scheduling the one or more background tasks to be performed based on their priority and the bandwidth allocation.
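The queueing scheme in this abstract could look roughly like the sketch below. The scoring formula in `priority`, the class name `BackgroundScheduler`, and the interpretation of a bandwidth allocation as a per-round byte budget are assumptions for illustration; the abstract does not specify any of them.

```python
import heapq
from collections import defaultdict


def priority(size_bytes, age_seconds):
    # Hypothetical score: older and smaller tasks rank higher.
    return age_seconds / max(size_bytes, 1)


class BackgroundScheduler:
    def __init__(self, allocations):
        # allocations: task type -> bytes that type may use per round (assumed unit)
        self.allocations = dict(allocations)
        self.queues = defaultdict(list)
        self._seq = 0  # tie-breaker so equal priorities stay in arrival order

    def add(self, task_id, task_type, size_bytes, age_seconds):
        if task_type not in self.allocations:
            raise KeyError(f"no bandwidth allocation for task type {task_type!r}")
        self._seq += 1
        score = priority(size_bytes, age_seconds)
        # heapq is a min-heap, so negate the score to pop the highest first.
        heapq.heappush(self.queues[task_type], (-score, self._seq, task_id, size_bytes))

    def next_round(self):
        """Pick tasks per queue, highest priority first, within each budget."""
        scheduled = []
        for task_type, budget in self.allocations.items():
            queue = self.queues[task_type]
            while queue and queue[0][3] <= budget:
                _, _, task_id, size = heapq.heappop(queue)
                budget -= size
                scheduled.append(task_id)
        return scheduled
```

A task that does not fit in the remaining budget of its queue stays at the head and is picked up in a later round.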

RESOURCE AND OPERATION MANAGEMENT ON A CLOUD PLATFORM

Various approaches are described to manage the execution of operations. Such operations may be performed without human intervention and may help maintain functionality of a cloud platform or client instances. In one aspect of the present approach, the number and/or type of automations starting in a given time frame may be limited to maintain an even or consistent distribution of resource usage. In a further aspect, the number and/or type of concurrent automations may be limited to a defined threshold to maintain an even or consistent distribution of resource usage.
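The two limits mentioned here (starts within a time frame, and concurrent automations) can be sketched as follows. `AutomationLimiter` and its parameters are hypothetical names, and a sliding window is only one possible reading of "a given time frame".

```python
from collections import deque


class AutomationLimiter:
    def __init__(self, max_concurrent, max_starts_per_window, window_seconds):
        self.max_concurrent = max_concurrent
        self.max_starts_per_window = max_starts_per_window
        self.window_seconds = window_seconds
        self.running = set()
        self.start_times = deque()  # start timestamps inside the current window

    def try_start(self, automation_id, now):
        """Return True and record the start if both limits allow it."""
        # Drop start records that have aged out of the sliding window.
        while self.start_times and now - self.start_times[0] >= self.window_seconds:
            self.start_times.popleft()
        if len(self.running) >= self.max_concurrent:
            return False
        if len(self.start_times) >= self.max_starts_per_window:
            return False
        self.running.add(automation_id)
        self.start_times.append(now)
        return True

    def finish(self, automation_id):
        self.running.discard(automation_id)
```

Timestamps are passed in explicitly so the behavior is deterministic; a real deployment would likely read a monotonic clock instead.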

Systems and methods for acquiring server resources at schedule time

Systems and methods are disclosed that acquire server resources at the time of scheduling an automated instance-related task, such as an instance migration task, and prior to starting the automated task (e.g., prior to determining scheduling conflicts, creating a change request, or creating a move context associated with starting the instance migration task). Advantageously, if acquiring the server resources fails, an orchestration server performing the automated task can simply retry acquiring the server resources rather than restarting the automated task and re-performing its steps, thus avoiding unnecessary overhead.
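The benefit of acquiring resources before the task starts can be shown with a small sketch. The function names `schedule_then_run` and `make_flaky_acquirer`, the `max_attempts` parameter, and the use of `None` to signal a failed acquisition are assumptions for the example, not details from the disclosure.

```python
def schedule_then_run(acquire_resources, run_steps, max_attempts=3):
    """Acquire resources first; run the task steps only after that succeeds.

    Returns (result, attempts_used). The task steps never run if acquisition
    keeps failing, so no task work has to be repeated on a retry.
    """
    for attempt in range(1, max_attempts + 1):
        lease = acquire_resources()
        if lease is not None:
            return run_steps(lease), attempt
    return None, max_attempts


def make_flaky_acquirer(failures):
    """Hypothetical acquirer that fails a fixed number of times, then succeeds."""
    state = {"left": failures}

    def acquire():
        if state["left"] > 0:
            state["left"] -= 1
            return None
        return "lease-1"

    return acquire
```

Only the cheap acquisition call is retried; the task steps passed as `run_steps` execute at most once.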

Systems, methods, and apparatuses for implementing a scheduler and workload manager with snapshot and resume functionality

In accordance with disclosed embodiments, there are provided systems, methods, and apparatuses for implementing a stateless, deterministic scheduler and work discovery system with interruption recovery. For instance, according to one embodiment, there is disclosed a system to implement a stateless scheduler service, in which the system includes: a processor and a memory to execute instructions at the system; a compute resource discovery engine to identify one or more computing resources available to execute workload tasks; a workload discovery engine to identify a plurality of workload tasks to be scheduled for execution; a cache to store information on behalf of the compute resource discovery engine and the workload discovery engine; a scheduler to request information from the cache specifying the one or more computing resources available to execute workload tasks and the plurality of workload tasks to be scheduled for execution; and further in which the scheduler is to schedule at least a portion of the plurality of workload tasks for execution via the one or more computing resources based on the information requested. Other related embodiments are disclosed.

Enhancing processing performance of a DNN module by bandwidth control of fabric interface

An exemplary computing environment having a DNN module can maintain one or more bandwidth throttling mechanisms. Illustratively, a first throttling mechanism can specify the number of cycles to wait between transactions on a cooperating fabric component (e.g., data bus). Illustratively, a second throttling mechanism can be a transaction count limiter that sets a threshold on the number of transactions to be processed during a given transaction sequence and limits the number of transactions, such as multiple transactions in flight, so that it does not exceed the set threshold. In an illustrative operation, by applying these two exemplary calculated throttling parameters, both the average bandwidth usage and the peak bandwidth usage can be limited. Operatively, with this fabric bandwidth control, the processing units of the DNN are optimized to process data across each transaction cycle, resulting in enhanced processing and lower power consumption.
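The interaction of the two throttles can be explored with a simple cycle-level simulation. This is a sketch under assumptions: the function name, the fixed per-transaction `latency`, and the one-issue-per-cycle model are invented for illustration and do not describe the actual hardware.

```python
def simulate_issue_cycles(num_transactions, wait_cycles, max_in_flight, latency):
    """Return the cycle at which each transaction is issued.

    wait_cycles   -- idle cycles required between consecutive issues
    max_in_flight -- cap on transactions that are issued but not yet complete
    latency       -- cycles a transaction stays in flight (assumed constant)
    """
    if max_in_flight < 1 or latency < 1 or wait_cycles < 0:
        raise ValueError("max_in_flight and latency must be >= 1, wait_cycles >= 0")
    issue_cycles = []
    completions = []  # completion cycle of every in-flight transaction
    cycle = 0
    next_allowed = 0
    while len(issue_cycles) < num_transactions:
        # Retire transactions whose completion cycle has been reached.
        completions = [c for c in completions if c > cycle]
        if cycle >= next_allowed and len(completions) < max_in_flight:
            issue_cycles.append(cycle)
            completions.append(cycle + latency)
            next_allowed = cycle + wait_cycles + 1
        cycle += 1
    return issue_cycles
```

In this model the wait between issues bounds the average issue rate, while the in-flight cap bounds how many transactions can overlap at once.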

Scheduling And Load-Balancing Replication-Based Migrations of Virtual Machines
20230315515 · 2023-10-05

Aspects of the disclosure provide for ordering and scheduling operations for migrating virtual machines in parallel. A migration system can provide for ordering and scheduling operations to be performed when only a subset of virtual machines slated for migration can be migrated at a time. Aspects of the disclosure provide for migrating interdependent virtual machines. Interdependent virtual machines may at least partially rely on data generated by applications or services of other virtual machines in the group. A migration system schedules and orders migration cycles to reduce the down time of services implemented by the virtual machines during cut-over operations in the migrations of the virtual machines.

Flexible hardware for high throughput vector dequantization with dynamic vector length and codebook size

The performance of a neural network (NN) and/or deep neural network (DNN) can be limited by the number of operations being performed as well as by the memory data management of the NN/DNN. Using vector quantization of neuron weight values, the processing of data by neurons can be optimized with respect to the number of operations as well as memory utilization to enhance the overall performance of a NN/DNN. Operatively, one or more contiguous segments of weight values can be converted into one or more vectors of arbitrary length and each of the one or more vectors can be assigned an index. The generated indexes can be stored in an exemplary vector quantization lookup table and retrieved at run time, on the fly, by exemplary fast weight lookup hardware as part of an inline de-quantization operation within an exemplary data processing function of the NN to obtain the one or more neuron weight values needed.
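The index-and-lookup idea can be sketched in software as below. Note the simplification: this example builds the codebook from exact duplicate vectors, so it is lossless, whereas practical vector quantization usually maps each vector to a nearest learned codeword. The function names and the `vector_len` parameter are assumptions for illustration.

```python
def quantize(weights, vector_len):
    """Split weights into contiguous vectors and replace each with an index.

    Returns (codebook, indexes), where codebook[i] is the vector for index i.
    Assumption: identical vectors share one codebook entry (exact match only).
    """
    codebook = []
    index_of = {}
    indexes = []
    for start in range(0, len(weights), vector_len):
        vec = tuple(weights[start:start + vector_len])
        if vec not in index_of:
            index_of[vec] = len(codebook)
            codebook.append(vec)
        indexes.append(index_of[vec])
    return codebook, indexes


def dequantize(codebook, indexes):
    """Inline lookup: expand each stored index back into its weight values."""
    return [w for i in indexes for w in codebook[i]]
```

Storage shrinks when many segments repeat, because each repeated vector costs one small index instead of `vector_len` full weight values.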

METHODS AND SYSTEMS FOR IMAGE PROCESSING

A method for image processing is provided. The method may include: obtaining a plurality of frames, each of the plurality of frames comprising a plurality of pixels; determining, based on the plurality of frames, whether a current frame of the plurality of frames comprises a moving object; in response to determining that the current frame includes no moving object, obtaining a first count of frames, and generating a target image by superimposing the first count of frames; in response to determining that the current frame includes a moving object, obtaining a second count of frames, and generating the target image by superimposing the second count of frames.
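A minimal sketch of the branching described in this abstract follows. Several choices here are assumptions that the abstract does not state: motion is detected by a per-pixel difference threshold between the last two frames, superimposing is implemented as averaging, and the static scene uses the larger frame count. Frames are plain nested lists of grayscale values.

```python
def has_motion(prev, curr, threshold):
    """Assumed detector: any pixel changed by more than the threshold."""
    return any(
        abs(a - b) > threshold
        for row_p, row_c in zip(prev, curr)
        for a, b in zip(row_p, row_c)
    )


def superimpose(frames):
    """Assumed superposition: per-pixel average over the given frames."""
    n = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [sum(f[y][x] for f in frames) / n for x in range(width)]
        for y in range(height)
    ]


def target_image(frames, static_count=4, moving_count=2, threshold=10):
    """Choose how many recent frames to superimpose based on detected motion."""
    moving = len(frames) > 1 and has_motion(frames[-2], frames[-1], threshold)
    count = moving_count if moving else static_count
    return superimpose(frames[-count:])
```

Under these assumptions, a static scene averages more frames to suppress noise, while a scene with motion averages fewer frames to limit ghosting.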